For more details, please refer to our GitHub: FlagEmbedding. **BGE-Multilingual-Gemma2** is an LLM-based multilingual embedding model. It is trained on a diverse range of languages and tasks based on google/gemma-2-9b. BGE-Multilingual-Gemma2 primarily demonstrates the following advancements: - Diverse training data: The model's training data spans a broad range of languages, including English, Chinese, Japanese, Korean, French, and more. Additionally, the data covers a variety of task types, such as retrieval, classification, and clustering. - Outstanding performance: The model exhibits state-of-the-art (SOTA) results on multilingual benchmarks like MIRACL, MTEB-pl, and MTEB-fr. It also achieves excellent performance on other major evaluations, including MTEB, C-MTEB and AIR-Bench. ## 📑 Open-source Plan - [x] Checkpoint - [ ] Training Data We will release the training data of **BGE-Multilingual-Gemma2** in the future. ## Usage ### Using FlagEmbedding By default, FlagLLMModel will use all available GPUs when encoding. Please set the `CUDA_VISIBLE_DEVICES` environment variable to select specific GPUs; you can also set it to an empty value to make all GPUs unavailable. ### Using Sentence Transformers ### Using HuggingFace Transformers ## Evaluation BGE-Multilingual-Gemma2 exhibits **state-of-the-art (SOTA) results on benchmarks like MIRACL, MTEB-pl, and MTEB-fr**. It also achieves excellent performance on other major evaluations, including MTEB, C-MTEB and AIR-Bench. - **MIRACL** nDCG@10: Recall@100: - **MTEB-fr/pl** - **MTEB** - **BEIR** - **C-MTEB** - **AIR-Bench** Long-Doc (en, Recall@10): QA (en&zh, nDCG@10): ## Model List `bge` is short for BAAI general embedding. | Model | Language | | Description | query instruction for retrieval [1] | | :----------------------------------------------------------- | :-----------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | | BAAI/bge-multilingual-gemma2 | Multilingual | - | An LLM-based multilingual embedding model, trained on a diverse range of languages and tasks. | | BAAI/bge-en-icl | English | - | An LLM-based dense retriever with in-context learning capabilities that can fully leverage the model's potential based on few-shot examples (4096 tokens) | Provide instructions and few-shot examples freely based on the given task. 
| | BAAI/bge-m3 | Multilingual | Inference Fine-tune | Multi-Functionality(dense retrieval, sparse retrieval, multi-vector(colbert)), Multi-Linguality, and Multi-Granularity(8192 tokens) | | | BAAI/llm-embedder | English | Inference Fine-tune | a unified embedding model to support diverse retrieval augmentation needs for LLMs | See README | | BAAI/bge-reranker-large | Chinese and English | Inference Fine-tune | a cross-encoder model which is more accurate but less efficient [2] | | | BAAI/bge-reranker-base | Chinese and English | Inference Fine-tune | a cross-encoder model which is more accurate but less efficient [2] | | | BAAI/bge-large-en-v1.5 | English | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-base-en-v1.5 | English | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-small-en-v1.5 | English | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-large-zh-v1.5 | Chinese | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-base-zh-v1.5 | Chinese | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-small-zh-v1.5 | Chinese | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-large-en | English | Inference Fine-tune | :trophy: rank **1st** in MTEB leaderboard | | | BAAI/bge-base-en | English | Inference Fine-tune | a base-scale model but with similar ability to | | | BAAI/bge-small-en | English | Inference Fine-tune | a small-scale model but with competitive performance | | | BAAI/bge-large-zh | Chinese | Inference Fine-tune | :trophy: rank **1st** in C-MTEB benchmark | | | BAAI/bge-base-zh | Chinese | Inference Fine-tune | a base-scale model but with similar ability to | | | BAAI/bge-small-zh | Chinese | Inference Fine-tune | a small-scale model but with competitive performance | | ## Citation If you find this repository useful, please consider giving a star :star: and citation",
+ "model_explanation_gemini": "A multilingual model for feature extraction and sentence similarity tasks, optimized for retrieval performance across various datasets.\n\nFeatures: \n- Multilingual capability \n- Feature extraction \n- Sentence similarity measurement \n- Retrieval task optimization \n\nComparison: \nThe model shows competitive retrieval performance (measured by NDCG, MAP, precision, and recall scores) across multiple datasets (NFCorpus, MSMARCO, FiQA2018) compared to other models in similar tasks."
+}
\ No newline at end of file
diff --git a/model_data_json/BAAI_llm-embedder.json b/model_data_json/BAAI_llm-embedder.json
new file mode 100644
index 0000000000000000000000000000000000000000..3c8cb601a495efe412ba55804d31edd775b61a9a
--- /dev/null
+++ b/model_data_json/BAAI_llm-embedder.json
@@ -0,0 +1,19 @@
+{
+ "model_id": "BAAI/llm-embedder",
+ "downloads": 82703,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "safetensors",
+ "bert",
+ "feature-extraction",
+ "arxiv:2310.07554",
+ "arxiv:2309.07597",
+ "license:mit",
+ "text-embeddings-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: mit ---
For more details, please refer to our GitHub: FlagEmbedding. English | 中文 **Hiring:** We're seeking experienced NLP researchers and intern students focusing on dense retrieval and retrieval-augmented LLMs. If you're interested, please feel free to reach out to us via email at zhengliu1026@gmail.com. FlagEmbedding can map any text to a low-dimensional dense vector, which can be used for tasks like retrieval, classification, clustering, and semantic search. It can also be used in vector databases for LLMs. ************* 🌟**Updates**🌟 ************* - 10/12/2023: Release LLM-Embedder, a unified embedding model to support diverse retrieval augmentation needs for LLMs. Paper :fire: - 09/15/2023: The technical report of BGE has been released - 09/15/2023: The massive training data of BGE has been released - 09/12/2023: New models: - **New reranker model**: release cross-encoder models and , which are more powerful than the embedding model. We recommend using/fine-tuning them to re-rank the top-k documents returned by embedding models. - **Updated embedding model**: release embedding model to alleviate the issue of the similarity distribution and enhance its retrieval ability without instruction. More - 09/07/2023: Update fine-tune code: add a script to mine hard negatives and support adding an instruction during fine-tuning. - 08/09/2023: BGE models are integrated into **Langchain**; you can use them like this. The C-MTEB **leaderboard** is available. - 08/05/2023: Release base-scale and small-scale models, **best performance among the models of the same size 🤗** - 08/02/2023: Release (short for BAAI General Embedding) Models, **rank 1st on MTEB and C-MTEB benchmark!** :tada: :tada: - 08/01/2023: We release the Chinese Massive Text Embedding Benchmark (**C-MTEB**), consisting of 31 test datasets. ## Model List `bge` is short for BAAI general embedding. 
| Model | Language | | Description | query instruction for retrieval [1] | |:-------------------------------|:--------:| :--------:| :--------:|:--------:| | BAAI/llm-embedder | English | Inference Fine-tune | a unified embedding model to support diverse retrieval augmentation needs for LLMs | See README | | BAAI/bge-reranker-large | Chinese and English | Inference Fine-tune | a cross-encoder model which is more accurate but less efficient [2] | | | BAAI/bge-reranker-base | Chinese and English | Inference Fine-tune | a cross-encoder model which is more accurate but less efficient [2] | | | BAAI/bge-large-en-v1.5 | English | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-base-en-v1.5 | English | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-small-en-v1.5 | English | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-large-zh-v1.5 | Chinese | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-base-zh-v1.5 | Chinese | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-small-zh-v1.5 | Chinese | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-large-en | English | Inference Fine-tune | :trophy: rank **1st** in MTEB leaderboard | | | BAAI/bge-base-en | English | Inference Fine-tune | a base-scale model but with similar ability to | | | BAAI/bge-small-en | English | Inference Fine-tune | a small-scale model but with competitive performance | | | BAAI/bge-large-zh | Chinese | Inference Fine-tune | :trophy: rank **1st** in C-MTEB benchmark | | | BAAI/bge-base-zh | Chinese | Inference Fine-tune | a base-scale model but with similar ability to | | | BAAI/bge-small-zh | Chinese | Inference Fine-tune | a small-scale model but with competitive performance | | [1\\]: If you need to search the relevant passages for a query, we suggest adding the instruction to the query; in other cases, no instruction is needed, just use the original query directly. In all cases, **no instruction** needs to be added to passages. [2\\]: Different from the embedding model, the reranker uses the question and document as input and directly outputs a similarity score instead of an embedding. To balance accuracy and time cost, a cross-encoder is widely used to re-rank the top-k documents retrieved by other simple models. For example, use the bge embedding model to retrieve the top 100 relevant documents, and then use the bge reranker to re-rank those 100 documents to get the final top-3 results. All models have been uploaded to the Huggingface Hub, and you can see them at If you cannot open the Huggingface Hub, you can also download the models at . ## Frequently asked questions **1. How to fine-tune bge embedding model?** Follow this example to prepare data and fine-tune your model. Some suggestions: - Mine hard negatives following this example, which can improve the retrieval performance. - In general, a larger hyper-parameter brings better performance. You can expand it by enabling , (df_config.json can refer to ds_config.json), etc. - If you pre-train bge on your data, the pre-trained model cannot be directly used to calculate similarity; it must be fine-tuned with contrastive learning before computing similarity. - If the accuracy of the fine-tuned model is still not high, it is recommended to use/fine-tune the cross-encoder model (bge-reranker) to re-rank top-k results. 
Hard negatives are also needed to fine-tune the reranker. **2. The similarity score between two dissimilar sentences is higher than 0.5** **We suggest using bge v1.5, which alleviates the issue of the similarity distribution.** Since we finetune the models by contrastive learning with a temperature of 0.01, the similarity distribution of the current BGE model is roughly in the interval \\[0.6, 1\\]. So a similarity score greater than 0.5 does not indicate that the two sentences are similar. For downstream tasks, such as passage retrieval or semantic similarity, **what matters is the relative order of the scores, not the absolute value.** If you need to filter similar sentences based on a similarity threshold, please select an appropriate similarity threshold based on the similarity distribution on your data (such as 0.8, 0.85, or even 0.9). **3. When does the query instruction need to be used?** For the , we improve its retrieval ability when not using instruction. Using no instruction causes only a slight degradation in retrieval performance compared with using an instruction. So you can generate embeddings without instructions in all cases for convenience. For a retrieval task that uses short queries to find long related documents, it is recommended to add instructions for these short queries. **The best method to decide whether to add instructions for queries is choosing the setting that achieves better performance on your task.** In all cases, no instruction needs to be added to the documents/passages. ## Usage ### Usage for Embedding Model Here are some examples of using models with FlagEmbedding, Sentence-Transformers, Langchain, or Huggingface Transformers. #### Using FlagEmbedding If it doesn't work for you, you can see FlagEmbedding for more methods to install FlagEmbedding. For the value of the argument , see Model List. By default, FlagModel will use all available GPUs when encoding. Please set the `CUDA_VISIBLE_DEVICES` environment variable to select specific GPUs; you can also set it to an empty value to make all GPUs unavailable. #### Using Sentence-Transformers You can also use the models with sentence-transformers: For the s2p (short query to long passage) retrieval task, each short query should start with an instruction (see Model List for instructions). But the instruction is not needed for passages. #### Using Langchain You can use in langchain like this: #### Using HuggingFace Transformers With the transformers package, you can use the model like this: First, you pass your input through the transformer model, then you select the last hidden state of the first token (i.e., [CLS]) as the sentence embedding. ### Usage for Reranker Different from the embedding model, the reranker takes a question and a document as input and directly outputs a similarity score instead of an embedding. You can get a relevance score by inputting a query and passage to the reranker. The reranker is optimized based on cross-entropy loss, so the relevance score is not bounded to a specific range. #### Using FlagEmbedding Get relevance scores (higher scores indicate more relevance): #### Using Huggingface transformers ## Evaluation bge models achieve **state-of-the-art performance on both the MTEB and C-MTEB leaderboards!** For more details and evaluation tools see our scripts. 
- **MTEB**: | Model Name | Dimension | Sequence Length | Average (56) | Retrieval (15) |Clustering (11) | Pair Classification (3) | Reranking (4) | STS (10) | Summarization (1) | Classification (12) | |:----:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| | BAAI/bge-large-en-v1.5 | 1024 | 512 | **64.23** | **54.29** | 46.08 | 87.12 | 60.03 | 83.11 | 31.61 | 75.97 | | BAAI/bge-base-en-v1.5 | 768 | 512 | 63.55 | 53.25 | 45.77 | 86.55 | 58.86 | 82.4 | 31.07 | 75.53 | | BAAI/bge-small-en-v1.5 | 384 | 512 | 62.17 |51.68 | 43.82 | 84.92 | 58.36 | 81.59 | 30.12 | 74.14 | | bge-large-en | 1024 | 512 | 63.98 | 53.9 | 46.98 | 85.8 | 59.48 | 81.56 | 32.06 | 76.21 | | bge-base-en | 768 | 512 | 63.36 | 53.0 | 46.32 | 85.86 | 58.7 | 81.84 | 29.27 | 75.27 | | gte-large | 1024 | 512 | 63.13 | 52.22 | 46.84 | 85.00 | 59.13 | 83.35 | 31.66 | 73.33 | | gte-base | 768 | 512 | 62.39 | 51.14 | 46.2 | 84.57 | 58.61 | 82.3 | 31.17 | 73.01 | | e5-large-v2 | 1024| 512 | 62.25 | 50.56 | 44.49 | 86.03 | 56.61 | 82.05 | 30.19 | 75.24 | | bge-small-en | 384 | 512 | 62.11 | 51.82 | 44.31 | 83.78 | 57.97 | 80.72 | 30.53 | 74.37 | | instructor-xl | 768 | 512 | 61.79 | 49.26 | 44.74 | 86.62 | 57.29 | 83.06 | 32.32 | 61.79 | | e5-base-v2 | 768 | 512 | 61.5 | 50.29 | 43.80 | 85.73 | 55.91 | 81.05 | 30.28 | 73.84 | | gte-small | 384 | 512 | 61.36 | 49.46 | 44.89 | 83.54 | 57.7 | 82.07 | 30.42 | 72.31 | | text-embedding-ada-002 | 1536 | 8192 | 60.99 | 49.25 | 45.9 | 84.89 | 56.32 | 80.97 | 30.8 | 70.93 | | e5-small-v2 | 384 | 512 | 59.93 | 49.04 | 39.92 | 84.67 | 54.32 | 80.39 | 31.16 | 72.94 | | sentence-t5-xxl | 768 | 512 | 59.51 | 42.24 | 43.72 | 85.06 | 56.42 | 82.63 | 30.08 | 73.42 | | all-mpnet-base-v2 | 768 | 514 | 57.78 | 43.81 | 43.69 | 83.04 | 59.36 | 80.28 | 27.49 | 65.07 | | sgpt-bloom-7b1-msmarco | 4096 | 2048 | 57.59 | 48.22 | 38.93 | 81.9 | 55.65 | 77.74 | 33.6 | 66.19 | - **C-MTEB**: We create the benchmark C-MTEB for Chinese text embedding which consists of 31 datasets from 6 tasks. Please refer to C_MTEB for a detailed introduction. 
| Model | Embedding dimension | Avg | Retrieval | STS | PairClassification | Classification | Reranking | Clustering | |:-------------------------------|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:| | **BAAI/bge-large-zh-v1.5** | 1024 | **64.53** | 70.46 | 56.25 | 81.6 | 69.13 | 65.84 | 48.99 | | BAAI/bge-base-zh-v1.5 | 768 | 63.13 | 69.49 | 53.72 | 79.75 | 68.07 | 65.39 | 47.53 | | BAAI/bge-small-zh-v1.5 | 512 | 57.82 | 61.77 | 49.11 | 70.41 | 63.96 | 60.92 | 44.18 | | BAAI/bge-large-zh | 1024 | 64.20 | 71.53 | 54.98 | 78.94 | 68.32 | 65.11 | 48.39 | | bge-large-zh-noinstruct | 1024 | 63.53 | 70.55 | 53 | 76.77 | 68.58 | 64.91 | 50.01 | | BAAI/bge-base-zh | 768 | 62.96 | 69.53 | 54.12 | 77.5 | 67.07 | 64.91 | 47.63 | | multilingual-e5-large | 1024 | 58.79 | 63.66 | 48.44 | 69.89 | 67.34 | 56.00 | 48.23 | | BAAI/bge-small-zh | 512 | 58.27 | 63.07 | 49.45 | 70.35 | 63.64 | 61.48 | 45.09 | | m3e-base | 768 | 57.10 | 56.91 | 50.47 | 63.99 | 67.52 | 59.34 | 47.68 | | m3e-large | 1024 | 57.05 | 54.75 | 50.42 | 64.3 | 68.2 | 59.66 | 48.88 | | multilingual-e5-base | 768 | 55.48 | 61.63 | 46.49 | 67.07 | 65.35 | 54.35 | 40.68 | | multilingual-e5-small | 384 | 55.38 | 59.95 | 45.27 | 66.45 | 65.85 | 53.86 | 45.26 | | text-embedding-ada-002(OpenAI) | 1536 | 53.02 | 52.0 | 43.35 | 69.56 | 64.31 | 54.28 | 45.68 | | luotuo | 1024 | 49.37 | 44.4 | 42.78 | 66.62 | 61 | 49.25 | 44.39 | | text2vec-base | 768 | 47.63 | 38.79 | 43.41 | 67.41 | 62.19 | 49.45 | 37.66 | | text2vec-large | 1024 | 47.36 | 41.94 | 44.97 | 70.86 | 60.66 | 49.16 | 30.02 | - **Reranking**: See C_MTEB for evaluation script. | Model | T2Reranking | T2RerankingZh2En\\* | T2RerankingEn2Zh\\* | MMarcoReranking | CMedQAv1 | CMedQAv2 | Avg | |:-------------------------------|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:| | text2vec-base-multilingual | 64.66 | 62.94 | 62.51 | 14.37 | 48.46 | 48.6 | 50.26 | | multilingual-e5-small | 65.62 | 60.94 | 56.41 | 29.91 | 67.26 | 66.54 | 57.78 | | multilingual-e5-large | 64.55 | 61.61 | 54.28 | 28.6 | 67.42 | 67.92 | 57.4 | | multilingual-e5-base | 64.21 | 62.13 | 54.68 | 29.5 | 66.23 | 66.98 | 57.29 | | m3e-base | 66.03 | 62.74 | 56.07 | 17.51 | 77.05 | 76.76 | 59.36 | | m3e-large | 66.13 | 62.72 | 56.1 | 16.46 | 77.76 | 78.27 | 59.57 | | bge-base-zh-v1.5 | 66.49 | 63.25 | 57.02 | 29.74 | 80.47 | 84.88 | 63.64 | | bge-large-zh-v1.5 | 65.74 | 63.39 | 57.03 | 28.74 | 83.45 | 85.44 | 63.97 | | BAAI/bge-reranker-base | 67.28 | 63.95 | 60.45 | 35.46 | 81.26 | 84.1 | 65.42 | | BAAI/bge-reranker-large | 67.6 | 64.03 | 61.44 | 37.16 | 82.15 | 84.18 | 66.09 | \\* : T2RerankingZh2En and T2RerankingEn2Zh are cross-language retrieval tasks ## Train ### BAAI Embedding We pre-train the models using retromae and train them on large-scale pair data using contrastive learning. **You can fine-tune the embedding model on your data following our examples.** We also provide a pre-train example. Note that the goal of pre-training is to reconstruct the text, and the pre-trained model cannot be used for similarity calculation directly, it needs to be fine-tuned. For more training details for bge see baai_general_embedding. ### BGE Reranker Cross-encoder will perform full-attention over the input pair, which is more accurate than embedding model (i.e., bi-encoder) but more time-consuming than embedding model. Therefore, it can be used to re-rank the top-k documents returned by embedding model. 
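As an illustrative sketch of this re-ranking step (assuming the `FlagReranker` helper from the FlagEmbedding package; the query and passages below are made up for demonstration):

```python
from FlagEmbedding import FlagReranker

# Load the cross-encoder reranker; use_fp16 speeds up inference with a small accuracy cost.
reranker = FlagReranker("BAAI/bge-reranker-large", use_fp16=True)

query = "what is a panda?"
candidates = [
    "The giant panda is a bear species endemic to China.",
    "Paris is the capital of France.",
]

# Score each (query, passage) pair; higher scores indicate higher relevance.
scores = reranker.compute_score([[query, passage] for passage in candidates])
reranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
print(reranked)
```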
We train the cross-encoder on multilingual pair data. The data format is the same as for the embedding model, so you can fine-tune it easily following our example. For more details please refer to ./FlagEmbedding/reranker/README.md ### Our Contributors: ## Contact If you have any question or suggestion related to this project, feel free to open an issue or pull request. You can also email Shitao Xiao (stxiao@baai.ac.cn) and Zheng Liu (liuzheng@baai.ac.cn). ## Citation If you find this repository useful, please consider giving a star :star: and citation ## License FlagEmbedding is licensed under the MIT License. The released models can be used for commercial purposes free of charge.",
+ "model_explanation_gemini": "\"BAAI_llm-embedder is a unified embedding model that maps text to dense vectors for retrieval-augmented tasks in LLMs, supporting retrieval, classification, clustering, and semantic search.\"\n\nFeatures: \n- Generates low-dimensional dense vectors from text \n- Designed for retrieval augmentation in LLMs \n- Supports tasks like retrieval, classification, clustering, and semantic search \n- Compatible with vector databases for LLMs \n\nComparison: \nThe model is part of the FlagEmbedding"
+}
\ No newline at end of file
diff --git a/model_data_json/EleutherAI_pythia-1b.json b/model_data_json/EleutherAI_pythia-1b.json
new file mode 100644
index 0000000000000000000000000000000000000000..c148d7955969b873bd0546d465e4e6df2992f65b
--- /dev/null
+++ b/model_data_json/EleutherAI_pythia-1b.json
@@ -0,0 +1,25 @@
+{
+ "model_id": "EleutherAI/pythia-1b",
+ "downloads": 83131,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "safetensors",
+ "gpt_neox",
+ "text-generation",
+ "causal-lm",
+ "pythia",
+ "en",
+ "dataset:the_pile",
+ "arxiv:2304.01373",
+ "arxiv:2101.00027",
+ "arxiv:2201.07311",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- language: - en tags: - pytorch - causal-lm - pythia license: apache-2.0 datasets: - the_pile --- The *Pythia Scaling Suite* is a collection of models developed to facilitate interpretability research (see paper). It contains two sets of eight models of sizes 70M, 160M, 410M, 1B, 1.4B, 2.8B, 6.9B, and 12B. For each size, there are two models: one trained on the Pile, and one trained on the Pile after the dataset has been globally deduplicated. All 8 model sizes are trained on the exact same data, in the exact same order. We also provide 154 intermediate checkpoints per model, hosted on Hugging Face as branches. The Pythia model suite was deliberately designed to promote scientific research on large language models, especially interpretability research. Despite not centering downstream performance as a design goal, we find the models match or exceed the performance of similar and same-sized models, such as those in the OPT and GPT-Neo suites. Details on previous early release and naming convention. Previously, we released an early version of the Pythia suite to the public. However, we decided to retrain the model suite to address a few hyperparameter discrepancies. This model card lists the changes; see appendix B in the Pythia paper for further discussion. We found no difference in benchmark performance between the two Pythia versions. The old models are still available, but we suggest the retrained suite if you are just starting to use Pythia. **This is the current release.** Please note that all models in the *Pythia* suite were renamed in January 2023. For clarity, a table comparing the old and new names is provided in this model card, together with exact parameter counts. # Pythia-1B ## Model Details - Developed by: EleutherAI - Model type: Transformer-based Language Model - Language: English - Learn more: Pythia's GitHub repository for training procedure, config files, and details on how to use. See paper for more evals and implementation details. - Library: GPT-NeoX - License: Apache 2.0 - Contact: to ask questions about this model, join the EleutherAI Discord, and post them in . Please read the existing *Pythia* documentation before asking about it in the EleutherAI Discord. For general correspondence: contact@eleuther. ai. | Pythia model | Non-Embedding Params | Layers | Model Dim | Heads | Batch Size | Learning Rate | Equivalent Models | | -----------: | -------------------: | :----: | :-------: | :---: | :--------: | :-------------------: | :--------------------: | | 70M | 18,915,328 | 6 | 512 | 8 | 2M | 1.0 x 10-3 | — | | 160M | 85,056,000 | 12 | 768 | 12 | 2M | 6.0 x 10-4 | GPT-Neo 125M, OPT-125M | | 410M | 302,311,424 | 24 | 1024 | 16 | 2M | 3.0 x 10-4 | OPT-350M | | 1.0B | 805,736,448 | 16 | 2048 | 8 | 2M | 3.0 x 10-4 | — | | 1.4B | 1,208,602,624 | 24 | 2048 | 16 | 2M | 2.0 x 10-4 | GPT-Neo 1.3B, OPT-1.3B | | 2.8B | 2,517,652,480 | 32 | 2560 | 32 | 2M | 1.6 x 10-4 | GPT-Neo 2.7B, OPT-2.7B | | 6.9B | 6,444,163,072 | 32 | 4096 | 32 | 2M | 1.2 x 10-4 | OPT-6.7B | | 12B | 11,327,027,200 | 36 | 5120 | 40 | 2M | 1.2 x 10-4 | — | Engineering details for the Pythia Suite. Deduped and non-deduped models of a given size have the same hyperparameters. “Equivalent” models have exactly the same architecture, and the same number of non-embedding parameters. ## Uses and Limitations ### Intended Use The primary intended use of Pythia is research on the behavior, functionality, and limitations of large language models. 
This suite is intended to provide a controlled setting for performing scientific experiments. We also provide 154 checkpoints per model: initial , 10 log-spaced checkpoints , and 143 evenly-spaced checkpoints from to . These checkpoints are hosted on Hugging Face as branches. Note that branch corresponds exactly to the model checkpoint on the branch of each model. You may also further fine-tune and adapt Pythia-1B for deployment, as long as your use is in accordance with the Apache 2.0 license. Pythia models work with the Hugging Face Transformers Library. If you decide to use pre-trained Pythia-1B as a basis for your fine-tuned model, please conduct your own risk and bias assessment. ### Out-of-scope use The Pythia Suite is **not** intended for deployment. It is not in itself a product and cannot be used for human-facing interactions. For example, the model may generate harmful or offensive text. Please evaluate the risks associated with your particular use case. Pythia models are English-language only, and are not suitable for translation or generating text in other languages. Pythia-1B has not been fine-tuned for downstream contexts in which language models are commonly deployed, such as writing genre prose or commercial chatbots. This means Pythia-1B will **not** respond to a given prompt the way a product like ChatGPT does. This is because, unlike this model, ChatGPT was fine-tuned using methods such as Reinforcement Learning from Human Feedback (RLHF) to better “follow” human instructions. ### Limitations and biases The core functionality of a large language model is to take a string of text and predict the next token. The statistically most likely next token need not produce the most “accurate” text. Never rely on Pythia-1B to produce factually accurate output. This model was trained on the Pile, a dataset known to contain profanity and texts that are lewd or otherwise offensive. See Section 6 of the Pile paper for a discussion of documented biases with regards to gender, religion, and race. Pythia-1B may produce socially unacceptable or undesirable text, *even if* the prompt itself does not include anything explicitly offensive. If you plan on using text generated through, for example, the Hosted Inference API, we recommend having a human curate the outputs of this language model before presenting it to other people. Please inform your audience that the text was generated by Pythia-1B. ### Quickstart Pythia models can be loaded and used via the following code, demonstrated here for the third checkpoint: Revision/branch corresponds exactly to the model checkpoint on the branch of each model. For more information on how to use all Pythia models, see documentation on GitHub.
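As a minimal sketch of the Quickstart above (the `step3000` revision below is just one illustrative checkpoint branch; any of the published branches can be substituted):

```python
from transformers import AutoTokenizer, GPTNeoXForCausalLM

# Load a specific intermediate checkpoint by pointing `revision` at its branch.
model = GPTNeoXForCausalLM.from_pretrained("EleutherAI/pythia-1b", revision="step3000")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-1b", revision="step3000")

inputs = tokenizer("Hello, I am", return_tensors="pt")
tokens = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(tokens[0]))
```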
## Training ### Training data The Pile is an 825GiB general-purpose dataset in English. It was created by EleutherAI specifically for training large language models. It contains texts from 22 diverse sources, roughly broken down into five categories: academic writing (e.g. arXiv), internet (e.g. CommonCrawl), prose (e.g. Project Gutenberg), dialogue (e.g. YouTube subtitles), and miscellaneous (e.g. GitHub, Enron Emails). See the Pile paper for a breakdown of all data sources, methodology, and a discussion of ethical implications. Consult the datasheet for more detailed documentation about the Pile and its component datasets. The Pile can be downloaded from the official website, or from a community mirror. The Pile was **not** deduplicated before being used to train Pythia-1B. ### Training procedure All models were trained on the exact same data, in the exact same order. Each model saw 299,892,736,000 tokens during training, and 143 checkpoints for each model are saved every 2,097,152,000 tokens, spaced evenly throughout training, from to (which is the same as ). In addition, we also provide frequent early checkpoints: and . This corresponds to training for just under 1 epoch on the Pile for non-deduplicated models, and about 1.5 epochs on the deduplicated Pile. All *Pythia* models trained for 143000 steps at a batch size of 2M (2,097,152 tokens). See GitHub for more details on the training procedure, including how to reproduce it. Pythia uses the same tokenizer as GPT-NeoX-20B. ## Evaluations All 16 *Pythia* models were evaluated using the LM Evaluation Harness. You can access the results by model and step at in the GitHub repository. Plots of evaluation results for all Pythia and Pythia-deduped models compared with OPT and BLOOM are available for LAMBADA (OpenAI), Physical Interaction: Question Answering (PIQA), WinoGrande, AI2 Reasoning Challenge (Easy Set), and SciQ. ## Changelog This section compares differences between the previously released Pythia v0 and the current models. See Appendix B of the Pythia paper for further discussion of these changes and the motivation behind them. We found that retraining Pythia had no impact on benchmark performance. - All model sizes are now trained with a uniform batch size of 2M tokens. Previously, the models of size 160M, 410M, and 1.4B parameters were trained with batch sizes of 4M tokens. - We added checkpoints at initialization (step 0) and steps {1,2,4,8,16,32,64,128,256,512} in addition to every 1000 training steps. - Flash Attention was used in the new retrained suite. - We remedied a minor inconsistency that existed in the original suite: all models of size 2.8B parameters or smaller had a learning rate (LR) schedule which decayed to a minimum LR of 10% of the starting LR, but the 6.9B and 12B models all used an LR schedule which decayed to a minimum LR of 0. In the redone training runs, we rectified this inconsistency: all models were now trained with the LR decaying to a minimum of 0.1× their maximum LR. ### Naming convention and parameter count *Pythia* models were renamed in January 2023. It is possible that the old naming convention still persists in some documentation by accident. The current naming convention (70M, 160M, etc.) is based on total parameter count. | current Pythia suffix | old suffix | total params | non-embedding params | | --------------------: | ---------: | -------------: | -------------------: | | 70M | 19M | 70,426,624 | 18,915,328 | | 160M | 125M | 162,322,944 | 85,056,000 | | 410M | 350M | 405,334,016 | 302,311,424 | | 1B | 800M | 1,011,781,632 | 805,736,448 | | 1.4B | 1.3B | 1,414,647,808 | 1,208,602,624 | | 2.8B | 2.7B | 2,775,208,960 | 2,517,652,480 | | 6.9B | 6.7B | 6,857,302,016 | 6,444,163,072 | | 12B | 13B | 11,846,072,320 | 11,327,027,200 | ",
+ "model_explanation_gemini": "A 1-billion-parameter English language model designed for interpretability research, trained on the Pile dataset with reproducible checkpoints, matching or exceeding similar-sized models like OPT and GPT-Neo.\n\n**Features:** \n- Transformer-based causal language model \n- Part of the Pythia Scaling Suite (1B parameters) \n- Trained on the Pile dataset (with deduplicated variant available) \n- 154 intermediate checkpoints provided \n- Matches/exceeds performance of comparable models"
+}
\ No newline at end of file
diff --git a/model_data_json/FacebookAI_xlm-roberta-large-finetuned-conll03-english.json b/model_data_json/FacebookAI_xlm-roberta-large-finetuned-conll03-english.json
new file mode 100644
index 0000000000000000000000000000000000000000..d8cf3390ca286ae01bc2ac977564147fbb209e16
--- /dev/null
+++ b/model_data_json/FacebookAI_xlm-roberta-large-finetuned-conll03-english.json
@@ -0,0 +1,115 @@
+{
+ "model_id": "FacebookAI/xlm-roberta-large-finetuned-conll03-english",
+ "downloads": 79650,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "rust",
+ "onnx",
+ "safetensors",
+ "xlm-roberta",
+ "token-classification",
+ "multilingual",
+ "af",
+ "am",
+ "ar",
+ "as",
+ "az",
+ "be",
+ "bg",
+ "bn",
+ "br",
+ "bs",
+ "ca",
+ "cs",
+ "cy",
+ "da",
+ "de",
+ "el",
+ "en",
+ "eo",
+ "es",
+ "et",
+ "eu",
+ "fa",
+ "fi",
+ "fr",
+ "fy",
+ "ga",
+ "gd",
+ "gl",
+ "gu",
+ "ha",
+ "he",
+ "hi",
+ "hr",
+ "hu",
+ "hy",
+ "id",
+ "is",
+ "it",
+ "ja",
+ "jv",
+ "ka",
+ "kk",
+ "km",
+ "kn",
+ "ko",
+ "ku",
+ "ky",
+ "la",
+ "lo",
+ "lt",
+ "lv",
+ "mg",
+ "mk",
+ "ml",
+ "mn",
+ "mr",
+ "ms",
+ "my",
+ "ne",
+ "nl",
+ "no",
+ "om",
+ "or",
+ "pa",
+ "pl",
+ "ps",
+ "pt",
+ "ro",
+ "ru",
+ "sa",
+ "sd",
+ "si",
+ "sk",
+ "sl",
+ "so",
+ "sq",
+ "sr",
+ "su",
+ "sv",
+ "sw",
+ "ta",
+ "te",
+ "th",
+ "tl",
+ "tr",
+ "ug",
+ "uk",
+ "ur",
+ "uz",
+ "vi",
+ "xh",
+ "yi",
+ "zh",
+ "arxiv:1911.02116",
+ "arxiv:2008.03415",
+ "arxiv:1910.09700",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- language: - multilingual - af - am - ar - as - az - be - bg - bn - br - bs - ca - cs - cy - da - de - el - en - eo - es - et - eu - fa - fi - fr - fy - ga - gd - gl - gu - ha - he - hi - hr - hu - hy - id - is - it - ja - jv - ka - kk - km - kn - ko - ku - ky - la - lo - lt - lv - mg - mk - ml - mn - mr - ms - my - ne - nl - no - om - or - pa - pl - ps - pt - ro - ru - sa - sd - si - sk - sl - so - sq - sr - su - sv - sw - ta - te - th - tl - tr - ug - uk - ur - uz - vi - xh - yi - zh --- # xlm-roberta-large-finetuned-conll03-english # Table of Contents 1. Model Details 2. Uses 3. Bias, Risks, and Limitations 4. Training 5. Evaluation 6. Environmental Impact 7. Technical Specifications 8. Citation 9. Model Card Authors 10. How To Get Started With the Model # Model Details ## Model Description The XLM-RoBERTa model was proposed in Unsupervised Cross-lingual Representation Learning at Scale by Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer and Veselin Stoyanov. It is based on Facebook's RoBERTa model released in 2019. It is a large multi-lingual language model, trained on 2.5TB of filtered CommonCrawl data. This model is XLM-RoBERTa-large fine-tuned with the conll2003 dataset in English. - **Developed by:** See associated paper - **Model type:** Multi-lingual language model - **Language(s) (NLP) or Countries (images):** XLM-RoBERTa is a multilingual model trained on 100 different languages; see GitHub Repo for full list; model is fine-tuned on a dataset in English - **License:** More information needed - **Related Models:** RoBERTa, XLM - **Parent Model:** XLM-RoBERTa-large - **Resources for more information:** -GitHub Repo -Associated Paper # Uses ## Direct Use The model is a language model. The model can be used for token classification, a natural language understanding task in which a label is assigned to some tokens in a text. ## Downstream Use Potential downstream use cases include Named Entity Recognition (NER) and Part-of-Speech (PoS) tagging. To learn more about token classification and other potential downstream use cases, see the Hugging Face token classification docs. ## Out-of-Scope Use The model should not be used to intentionally create hostile or alienating environments for people. # Bias, Risks, and Limitations **CONTENT WARNING: Readers should be made aware that language generated by this model may be disturbing or offensive to some and may propagate historical and current stereotypes.** Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). In the context of tasks relevant to this model, Mishra et al. (2020) explore social biases in NER systems for English and find that there is systematic bias in existing NER systems in that they fail to identify named entities from different demographic groups (though this paper did not look at BERT). For example, using a sample sentence from Mishra et al. (2020): ## Recommendations Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. # Training See the following resources for training data and training procedure details: - XLM-RoBERTa-large model card - CoNLL-2003 data card - Associated paper # Evaluation See the associated paper for evaluation details. # Environmental Impact Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. 
(2019). - **Hardware Type:** 500 32GB Nvidia V100 GPUs (from the associated paper) - **Hours used:** More information needed - **Cloud Provider:** More information needed - **Compute Region:** More information needed - **Carbon Emitted:** More information needed # Technical Specifications See the associated paper for further details. # Citation **BibTeX:** **APA:** - Conneau, A., Khandelwal, K., Goyal, N., Chaudhary, V., Wenzek, G., Guzmán, F., ... & Stoyanov, V. (2019). Unsupervised cross-lingual representation learning at scale. arXiv preprint arXiv:1911.02116. # Model Card Authors This model card was written by the team at Hugging Face. # How to Get Started with the Model Use the code below to get started with the model. You can use this model directly within a pipeline for NER.",
+ "model_explanation_gemini": "\"Finetuned for English token classification tasks like Named Entity Recognition (NER) using the CoNLL-2003 dataset, based on a multilingual XLM-RoBERTa-large model.\"\n\nFeatures: \n- Multilingual base (XLM-RoBERTa-large) fine-tuned specifically for English \n- Task: Token classification (e.g., NER, POS tagging) \n- Trained on CoNLL-2003 dataset \n\nComparison: \nUnlike generic multilingual models, this variant"
+}
\ No newline at end of file
diff --git a/model_data_json/Helsinki-NLP_opus-mt-en-da.json b/model_data_json/Helsinki-NLP_opus-mt-en-da.json
new file mode 100644
index 0000000000000000000000000000000000000000..2314a4de47cb5f827dc946d20974e3bcdceebf5e
--- /dev/null
+++ b/model_data_json/Helsinki-NLP_opus-mt-en-da.json
@@ -0,0 +1,20 @@
+{
+ "model_id": "Helsinki-NLP/opus-mt-en-da",
+ "downloads": 81331,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "tf",
+ "marian",
+ "text2text-generation",
+ "translation",
+ "en",
+ "da",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- tags: - translation license: apache-2.0 --- ### opus-mt-en-da * source languages: en * target languages: da * OPUS readme: en-da * dataset: opus * model: transformer-align * pre-processing: normalization + SentencePiece * download original weights: opus-2019-12-18.zip * test set translations: opus-2019-12-18.test.txt * test set scores: opus-2019-12-18.eval.txt ## Benchmarks | testset | BLEU | chr-F | |-----------------------|-------|-------| | Tatoeba.en.da | 60.4 | 0.745 |",
+ "model_explanation_gemini": "Translates English (en) to Danish (da) using a transformer-align model with SentencePiece pre-processing. \n\nFeatures: \n- Source language: English (en) \n- Target language: Danish (da) \n- Architecture: transformer-align \n- Pre-processing: normalization + SentencePiece \n- Training dataset: OPUS \n- Benchmarked on Tatoeba with BLEU (60.4) and chr-F (0.745) scores \n\nComparison: \nNo explicit comparison with other"
+}
\ No newline at end of file
diff --git a/model_data_json/HuggingFaceTB_SmolLM2-1.7B-Instruct.json b/model_data_json/HuggingFaceTB_SmolLM2-1.7B-Instruct.json
new file mode 100644
index 0000000000000000000000000000000000000000..4b550ee9de73802a8de98661d11393989293c603
--- /dev/null
+++ b/model_data_json/HuggingFaceTB_SmolLM2-1.7B-Instruct.json
@@ -0,0 +1,25 @@
+{
+ "model_id": "HuggingFaceTB/SmolLM2-1.7B-Instruct",
+ "downloads": 80704,
+ "tags": [
+ "transformers",
+ "tensorboard",
+ "onnx",
+ "safetensors",
+ "llama",
+ "text-generation",
+ "transformers.js",
+ "conversational",
+ "en",
+ "arxiv:2502.02737",
+ "base_model:HuggingFaceTB/SmolLM2-1.7B",
+ "base_model:quantized:HuggingFaceTB/SmolLM2-1.7B",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- library_name: transformers license: apache-2.0 language: - en pipeline_tag: text-generation tags: - safetensors - onnx - transformers.js base_model: - HuggingFaceTB/SmolLM2-1.7B --- # SmolLM2 !image/png ## Table of Contents 1. Model Summary 2. Evaluation 3. Examples 4. Limitations 5. Training 6. License 7. Citation ## Model Summary SmolLM2 is a family of compact language models available in three sizes: 135M, 360M, and 1.7B parameters. They are capable of solving a wide range of tasks while being lightweight enough to run on-device. More details are in our paper. The 1.7B variant demonstrates significant advances over its predecessor SmolLM1-1.7B, particularly in instruction following, knowledge, reasoning, and mathematics. It was trained on 11 trillion tokens using a diverse dataset combination: FineWeb-Edu, DCLM, and The Stack, along with new mathematics and coding datasets that we curated and will release soon. We developed the instruct version through supervised fine-tuning (SFT) using a combination of public datasets and our own curated datasets. We then applied Direct Preference Optimization (DPO) using UltraFeedback. The instruct model additionally supports tasks such as text rewriting, summarization and function calling thanks to datasets developed by Argilla such as Synth-APIGen-v0.1. You can find the SFT dataset here. For more details, refer to the linked repository, where you will find pre-training, post-training, evaluation and local inference code. ### How to use #### Transformers A minimal usage sketch is shown after the Examples section below. #### Chat in TRL You can also use the TRL CLI to chat with the model from the terminal: #### Transformers.js ## Evaluation In this section, we report the evaluation results of SmolLM2. All evaluations are zero-shot unless stated otherwise, and we use lighteval to run them. ## Base Pre-Trained Model | Metric | SmolLM2-1.7B | Llama-1B | Qwen2.5-1.5B | SmolLM1-1.7B | |------------------|--------------|-------------|---------------|--------------| | HellaSwag | **68.7** | 61.2 | 66.4 | 62.9 | | ARC (Average) | **60.5** | 49.2 | 58.5 | 59.9 | | PIQA | **77.6** | 74.8 | 76.1 | 76.0 | | MMLU-Pro (MCF) | **19.4** | 11.7 | 13.7 | 10.8 | | CommonsenseQA | **43.6** | 41.2 | 34.1 | 38.0 | | TriviaQA | **36.7** | 28.1 | 20.9 | 22.5 | | Winogrande | **59.4** | 57.8 | 59.3 | 54.7 | | OpenBookQA | 42.2 | 38.4 | 40.0 | **42.4** | | GSM8K (5-shot) | 31.0 | 7.2 | **61.3** | 5.5 | ## Instruction Model | Metric | SmolLM2-1.7B-Instruct | Llama-1B-Instruct | Qwen2.5-1.5B-Instruct | SmolLM1-1.7B-Instruct | |:-----------------------------|:---------------------:|:-----------------:|:----------------------:|:----------------------:| | IFEval (Average prompt/inst) | **56.7** | 53.5 | 47.4 | 23.1 | | MT-Bench | 6.13 | 5.48 | **6.52** | 4.33 | | OpenRewrite-Eval (micro_avg RougeL) | 44.9 | 39.2 | **46.9** | NaN | | HellaSwag | **66.1** | 56.1 | 60.9 | 55.5 | | ARC (Average) | **51.7** | 41.6 | 46.2 | 43.7 | | PIQA | **74.4** | 72.3 | 73.2 | 71.6 | | MMLU-Pro (MCF) | 19.3 | 12.7 | **24.2** | 11.7 | | BBH (3-shot) | 32.2 | 27.6 | **35.3** | 25.7 | | GSM8K (5-shot) | **48.2** | 26.8 | 42.8 | 4.62 | ## Examples Below are some system and instruct prompts that work well for special tasks. ### Text rewriting ### Summarization ### Function calling SmolLM2-1.7B-Instruct can handle function calling; it scores 27% on the BFCL Leaderboard. Here's how you can leverage it. More details, such as parallel function calls and cases where tools are not available, can be found here.
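As a rough sketch of the Transformers usage referenced in the 'How to use' section above (the prompt and generation settings below are illustrative, assuming the standard chat-template workflow for this checkpoint):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16, device_map="auto")

# Build a chat prompt with the model's chat template and generate a reply.
messages = [{"role": "user", "content": "What is the capital of Sweden?"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(input_ids, max_new_tokens=64, temperature=0.2, top_p=0.9, do_sample=True)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```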
## Limitations SmolLM2 models primarily understand and generate content in English. They can produce text on a variety of topics, but the generated content may not always be factually accurate, logically consistent, or free from biases present in the training data. These models should be used as assistive tools rather than definitive sources of information. Users should always verify important information and critically evaluate any generated content. ## Training ### Model - **Architecture:** Transformer decoder - **Pretraining tokens:** 11T - **Precision:** bfloat16 ### Hardware - **GPUs:** 256 H100 ### Software - **Training Framework:** nanotron - **Alignment Handbook:** alignment-handbook ## License Apache 2.0 ## Citation",
+ "model_explanation_gemini": "A compact 1.7B-parameter instruction-tuned language model optimized for on-device use, excelling in tasks like text rewriting, summarization, function calling, and reasoning while outperforming similar-sized models like Llama-1B and SmolLM1-1.7B in instruction following and knowledge tasks. \n\n**Features:** \n- Lightweight design for on-device deployment \n- Instruction fine-tuning (SFT + DPO) for improved task performance \n- Supports text"
+}
\ No newline at end of file
diff --git a/model_data_json/HuggingFaceTB_SmolVLM-Instruct.json b/model_data_json/HuggingFaceTB_SmolVLM-Instruct.json
new file mode 100644
index 0000000000000000000000000000000000000000..6a0a437bac0147e5794c2ad1a78a0f611f11fe21
--- /dev/null
+++ b/model_data_json/HuggingFaceTB_SmolVLM-Instruct.json
@@ -0,0 +1,23 @@
+{
+ "model_id": "HuggingFaceTB/SmolVLM-Instruct",
+ "downloads": 72887,
+ "tags": [
+ "transformers",
+ "onnx",
+ "safetensors",
+ "idefics3",
+ "image-text-to-text",
+ "conversational",
+ "en",
+ "dataset:HuggingFaceM4/the_cauldron",
+ "dataset:HuggingFaceM4/Docmatix",
+ "arxiv:2504.05299",
+ "base_model:HuggingFaceTB/SmolLM2-1.7B-Instruct",
+ "base_model:quantized:HuggingFaceTB/SmolLM2-1.7B-Instruct",
+ "license:apache-2.0",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- library_name: transformers license: apache-2.0 datasets: - HuggingFaceM4/the_cauldron - HuggingFaceM4/Docmatix pipeline_tag: image-text-to-text language: - en base_model: - HuggingFaceTB/SmolLM2-1.7B-Instruct - google/siglip-so400m-patch14-384 --- # SmolVLM SmolVLM is a compact open multimodal model that accepts arbitrary sequences of image and text inputs to produce text outputs. Designed for efficiency, SmolVLM can answer questions about images, describe visual content, create stories grounded on multiple images, or function as a pure language model without visual inputs. Its lightweight architecture makes it suitable for on-device applications while maintaining strong performance on multimodal tasks. ## Model Summary - **Developed by:** Hugging Face 🤗 - **Model type:** Multi-modal model (image+text) - **Language(s) (NLP):** English - **License:** Apache 2.0 - **Architecture:** Based on Idefics3 (see technical summary) ## Resources - **Demo:** SmolVLM Demo - **Blog:** Blog post ## Uses SmolVLM can be used for inference on multimodal (image + text) tasks where the input comprises text queries along with one or more images. Text and images can be interleaved arbitrarily, enabling tasks like image captioning, visual question answering, and storytelling based on visual content. The model does not support image generation. To fine-tune SmolVLM on a specific task, you can follow the fine-tuning tutorial. ### Technical Summary SmolVLM leverages the lightweight SmolLM2 language model to provide a compact yet powerful multimodal experience. It introduces several changes compared to previous Idefics models: - **Image compression:** We introduce a more radical image compression compared to Idefics3 to enable the model to infer faster and use less RAM. - **Visual Token Encoding:** SmolVLM uses 81 visual tokens to encode image patches of size 384×384. Larger images are divided into patches, each encoded separately, enhancing efficiency without compromising performance. More details about the training and architecture are available in our technical report. ### How to get started You can use transformers to load, infer and fine-tune SmolVLM. ### Model optimizations **Precision**: For better performance, load and run the model in half-precision ( or ) if your hardware supports it. You can also load SmolVLM with 4/8-bit quantization using bitsandbytes, torchao or Quanto. Refer to this page for other options. **Vision Encoder Efficiency**: Adjust the image resolution by setting when initializing the processor, where N is your desired value. The default works well, which results in input images of size 1536×1536. For documents, might be beneficial. Decreasing N can save GPU memory and is appropriate for lower-resolution images. This is also useful if you want to fine-tune on videos. ## Misuse and Out-of-scope Use SmolVLM is not intended for high-stakes scenarios or critical decision-making processes that affect an individual's well-being or livelihood. The model may produce content that appears factual but may not be accurate. Misuse includes, but is not limited to: - Prohibited Uses: - Evaluating or scoring individuals (e.g., in employment, education, credit) - Critical automated decision-making - Generating unreliable factual content - Malicious Activities: - Spam generation - Disinformation campaigns - Harassment or abuse - Unauthorized surveillance ### License SmolVLM is built upon the shape-optimized SigLIP as image encoder and SmolLM2 for text decoder part. 
We release the SmolVLM checkpoints under the Apache 2.0 license. ## Training Details ### Training Data The training data comes from The Cauldron and Docmatix datasets, with emphasis on document understanding (25%) and image captioning (18%), while maintaining balanced coverage across other crucial capabilities like visual reasoning, chart comprehension, and general instruction following. ## Evaluation | Model | MMMU (val) | MathVista (testmini) | MMStar (val) | DocVQA (test) | TextVQA (val) | Min GPU RAM required (GB) | |-------------------|------------|----------------------|--------------|---------------|---------------|---------------------------| | SmolVLM | 38.8 | 44.6 | 42.1 | 81.6 | 72.7 | 5.02 | | Qwen-VL 2B | 41.1 | 47.8 | 47.5 | 90.1 | 79.7 | 13.70 | | InternVL2 2B | 34.3 | 46.3 | 49.8 | 86.9 | 73.4 | 10.52 | | PaliGemma 3B 448px| 34.9 | 28.7 | 48.3 | 32.2 | 56.0 | 6.72 | | moondream2 | 32.4 | 24.3 | 40.3 | 70.5 | 65.2 | 3.87 | | MiniCPM-V-2 | 38.2 | 39.8 | 39.1 | 71.9 | 74.1 | 7.88 | | MM1.5 1B | 35.8 | 37.2 | 0.0 | 81.0 | 72.5 | NaN | # Citation information You can cite us in the following way:",
+ "model_explanation_gemini": "\"SmolVLM is a compact multimodal model that processes interleaved image and text inputs to generate text outputs, excelling in tasks like visual question answering, image captioning, and storytelling while optimized for efficiency and on-device use.\"\n\n**Model Features:** \n1. Multimodal (image + text) input with arbitrary interleaving \n2. Text-only capability (functions as language model) \n3. Lightweight architecture for on-device applications \n4. 81 visual tokens"
+}
\ No newline at end of file
diff --git a/model_data_json/HuggingFaceTB_SmolVLM2-2.2B-Instruct.json b/model_data_json/HuggingFaceTB_SmolVLM2-2.2B-Instruct.json
new file mode 100644
index 0000000000000000000000000000000000000000..bdcea01e30a4fc3d80f485a481b1c2c8120ae146
--- /dev/null
+++ b/model_data_json/HuggingFaceTB_SmolVLM2-2.2B-Instruct.json
@@ -0,0 +1,33 @@
+{
+ "model_id": "HuggingFaceTB/SmolVLM2-2.2B-Instruct",
+ "downloads": 74571,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "smolvlm",
+ "image-text-to-text",
+ "video-text-to-text",
+ "conversational",
+ "en",
+ "dataset:HuggingFaceM4/the_cauldron",
+ "dataset:HuggingFaceM4/Docmatix",
+ "dataset:lmms-lab/LLaVA-OneVision-Data",
+ "dataset:lmms-lab/M4-Instruct-Data",
+ "dataset:HuggingFaceFV/finevideo",
+ "dataset:MAmmoTH-VL/MAmmoTH-VL-Instruct-12M",
+ "dataset:lmms-lab/LLaVA-Video-178K",
+ "dataset:orrzohar/Video-STaR",
+ "dataset:Mutonix/Vript",
+ "dataset:TIGER-Lab/VISTA-400K",
+ "dataset:Enxin/MovieChat-1K_train",
+ "dataset:ShareGPT4Video/ShareGPT4Video",
+ "arxiv:2504.05299",
+ "base_model:HuggingFaceTB/SmolVLM-Instruct",
+ "base_model:finetune:HuggingFaceTB/SmolVLM-Instruct",
+ "license:apache-2.0",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- library_name: transformers license: apache-2.0 datasets: - HuggingFaceM4/the_cauldron - HuggingFaceM4/Docmatix - lmms-lab/LLaVA-OneVision-Data - lmms-lab/M4-Instruct-Data - HuggingFaceFV/finevideo - MAmmoTH-VL/MAmmoTH-VL-Instruct-12M - lmms-lab/LLaVA-Video-178K - orrzohar/Video-STaR - Mutonix/Vript - TIGER-Lab/VISTA-400K - Enxin/MovieChat-1K_train - ShareGPT4Video/ShareGPT4Video pipeline_tag: image-text-to-text tags: - video-text-to-text language: - en base_model: - HuggingFaceTB/SmolVLM-Instruct --- # SmolVLM2 2.2B SmolVLM2-2.2B is a lightweight multimodal model designed to analyze video content. The model processes videos, images, and text inputs to generate text outputs - whether answering questions about media files, comparing visual content, or transcribing text from images. Despite its compact size, requiring only 5.2GB of GPU RAM for video inference, it delivers robust performance on complex multimodal tasks. This efficiency makes it particularly well-suited for on-device applications where computational resources may be limited. ## Model Summary - **Developed by:** Hugging Face 🤗 - **Model type:** Multi-modal model (image/multi-image/video/text) - **Language(s) (NLP):** English - **License:** Apache 2.0 - **Architecture:** Based on Idefics3 (see technical summary) ## Resources - **Demo:** Video Highlight Generator - **Blog:** Blog post ## Uses SmolVLM2 can be used for inference on multimodal (video / image / text) tasks where the input consists of text queries along with video or one or more images. Text and media files can be interleaved arbitrarily, enabling tasks like captioning, visual question answering, and storytelling based on visual content. The model does not support image or video generation. To fine-tune SmolVLM2 on a specific task, you can follow the fine-tuning tutorial. ## Evaluation ### Vision Evaluation | Model | Mathvista | MMMU | OCRBench | MMStar | AI2D | ChartQA_Test | Science_QA | TextVQA Val | DocVQA Val | |-------------------|-----------|-------|----------|--------|------|--------------|------------|-------------|------------| | **SmolVLM2 2.2B** | 51.5 | 42 | 72.9 | 46 | 70 | 68.84 | 90 | 73.21 | 79.98 | | SmolVLM 2.2B | 43.9 | 38.3 | 65.5 | 41.8 | 84.5 | 71.6 | 84.5 | 72.1 | 79.7 | ### Video Evaluation We evaluated the performance of the SmolVLM2 family on the following scientific benchmarks: | Size | Video-MME | MLVU | MVBench | |----------|-----------------|----------|---------------| | 2.2B | 52.1 | 55.2 | 46.27 | | 500M | 42.2 | 47.3 | 39.73 | | 256M | 33.7 | 40.6 | 32.7 | ### How to get started You can use transformers to load, infer and fine-tune SmolVLM. Make sure you have num2words, flash-attn and latest transformers installed. You can load the model as follows. #### Simple Inference You preprocess your inputs directly using chat templates and directly passing them #### Video Inference To use SmolVLM2 for video inference, make sure you have decord installed. #### Multi-image Interleaved Inference You can interleave multiple media with text using chat templates. ### Model optimizations ## Misuse and Out-of-scope Use SmolVLM is not intended for high-stakes scenarios or critical decision-making processes that affect an individual's well-being or livelihood. The model may produce content that appears factual but may not be accurate. 
Misuse includes, but is not limited to: - Prohibited Uses: - Evaluating or scoring individuals (e.g., in employment, education, credit) - Critical automated decision-making - Generating unreliable factual content - Malicious Activities: - Spam generation - Disinformation campaigns - Harassment or abuse - Unauthorized surveillance ### License SmolVLM2 is built upon the shape-optimized SigLIP as image encoder and SmolLM2 for text decoder part. We release the SmolVLM2 checkpoints under the Apache 2.0 license. ## Citation information You can cite us in the following way: ## Training Data SmolVLM2 used 3.3M samples for training originally from ten different datasets: LlaVa Onevision, M4-Instruct, Mammoth, LlaVa Video 178K, FineVideo, VideoStar, VRipt, Vista-400K, MovieChat and ShareGPT4Video. In the following plots we give a general overview of the samples across modalities and the source of those samples. ## Data Split per modality | Data Type | Percentage | |--------------|------------| | Image | 34.4% | | Text | 20.2% | | Video | 33.0% | | Multi-image | 12.3% | ## Granular dataset slices per modality ### Text Datasets | Dataset | Percentage | |--------------------------------------------|------------| | llava-onevision/magpie_pro_ft3_80b_mt | 6.8% | | llava-onevision/magpie_pro_ft3_80b_tt | 6.8% | | llava-onevision/magpie_pro_qwen2_72b_tt | 5.8% | | llava-onevision/mathqa | 0.9% | ### Multi-image Datasets | Dataset | Percentage | |--------------------------------------------|------------| | m4-instruct-data/m4_instruct_multiimage | 10.4% | | mammoth/multiimage-cap6 | 1.9% | ### Image Datasets | Dataset | Percentage | |--------------------------------------------|------------| | llava-onevision/other | 17.4% | | llava-onevision/vision_flan | 3.9% | | llava-onevision/mavis_math_metagen | 2.6% | | llava-onevision/mavis_math_rule_geo | 2.5% | | llava-onevision/sharegpt4o | 1.7% | | llava-onevision/sharegpt4v_coco | 1.5% | | llava-onevision/image_textualization | 1.3% | | llava-onevision/sharegpt4v_llava | 0.9% | | llava-onevision/mapqa | 0.9% | | llava-onevision/qa | 0.8% | | llava-onevision/textocr | 0.8% | ### Video Datasets | Dataset | Percentage | |--------------------------------------------|------------| | llava-video-178k/1-2m | 7.3% | | llava-video-178k/2-3m | 7.0% | | other-video/combined | 5.7% | | llava-video-178k/hound | 4.4% | | llava-video-178k/0-30s | 2.4% | | video-star/starb | 2.2% | | vista-400k/combined | 2.2% | | vript/long | 1.0% | | ShareGPT4Video/all | 0.8% |",
+ "model_explanation_gemini": "A lightweight multimodal model analyzing video, image, and text inputs to generate text outputs for tasks like visual question answering, captioning, and transcription, optimized for efficiency in resource-limited environments. \n\n**Features:** \n- Multimodal (video/image/text input, text output) \n- Tasks: captioning, visual QA, storytelling, text transcription \n- Compact size (5.2GB GPU RAM for video inference) \n- No image/video generation capability \n- Apache 2.0 license"
+}
\ No newline at end of file
diff --git a/model_data_json/KBLab_kb-whisper-small.json b/model_data_json/KBLab_kb-whisper-small.json
new file mode 100644
index 0000000000000000000000000000000000000000..c47043b3851c7ee8dd3fcf6ce826ffc228b57959
--- /dev/null
+++ b/model_data_json/KBLab_kb-whisper-small.json
@@ -0,0 +1,21 @@
+{
+ "model_id": "KBLab/kb-whisper-small",
+ "downloads": 28474,
+ "tags": [
+ "transformers",
+ "onnx",
+ "safetensors",
+ "whisper",
+ "automatic-speech-recognition",
+ "ctranslate2",
+ "sv",
+ "dataset:KBLab/rixvox-v2",
+ "base_model:openai/whisper-small",
+ "base_model:quantized:openai/whisper-small",
+ "license:apache-2.0",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- library_name: transformers base_model: openai/whisper-small language: - sv pipeline_tag: automatic-speech-recognition license: apache-2.0 datasets: - KBLab/rixvox-v2 tags: - ctranslate2 --- ## KB-Whisper Small The National Library of Sweden releases a new suite of Whisper models trained on over 50,000 hours of Swedish speech. In evaluations across FLEURS, CommonVoice and NST, our best performing model reduces the Word Error Rate (WER) by an average of 47% compared to OpenAI's . The performance of smaller Whisper model sizes on Swedish speech has also substantially improved, with outperforming (a model six times its size). | Model size | | FLEURS | CommonVoice | NST | |------------|---------|--------|-------------|------| | tiny | **KBLab** | **13.2** | **12.9** | **11.2** | | | OpenAI | 59.2 | 67.8 | 85.2 | | base | **KBLab** | **9.1** | **8.7** | **7.8** | | | OpenAI | 39.6 | 52.1 | 53.4 | | small | **KBLab** | **7.3** | **6.4** | **6.6** | | | OpenAI | 20.6 | 26.4 | 26.4 | | medium | **KBLab** | **6.6** | **5.4** | **5.8** | | | OpenAI | 12.1 | 15.8 | 17.1 | | large-v3 | **KBLab** | **5.4** | **4.1** | **5.2** | | | OpenAI | 7.8 | 9.5 | 11.3 | Table: **Word Error Rate (WER)** comparison between KBLab's Whisper models and the corresponding OpenAI versions. ### Usage We provide checkpoints in different formats: , (GGML), , and (used in and ). #### Hugging Face Inference example for using with Hugging Face: #### Faster-whisper Faster-whisper provides fast and efficient inference via a reimplementation of Whisper using . #### WhisperX WhisperX provides a convenient method of getting accurate word level timestamps. The library combines (force aligns) the text output of Whisper with the accurate timestamps of Wav2vec2. We provide an example below of how to use together with KBLab/wav2vec2-large-voxrex-swedish. #### Whisper.cpp / GGML We provide GGML checkpoints used in the apps and . To use our model with first clone the repository and build the library: To use the model you need to download one of the GGML checkpoints we have uploaded. You can either press the download buttons here, or download using : Run inference by specifying the model path after the argument , along with the path to the audio file as the last positional argument. #### onnx (optimum) and transformers.js usage You can use the checkpoints via Hugging Face's library in the following manner: An example of an app that runs inference locally in the browser with and can be found at (created by Pierre Mesure). A template for setting up such an app with javascript can be found at ### Training data Our models have been trained on over 50,000 hours of Swedish audio with text transcriptions. The models were trained in 2 stages, each characterized by the application of different quality filters and thresholds for said filters. Stage 1 employed low threshold values (0 to 0.30 BLEU depending on dataset), whereas Stage 2 used stricter thresholds (, weighted ROUGE-N , CER of first and last 10 characters ). | Dataset | Continued pretraining (h) -- Stage 1 | Finetuning (h) -- Stage 2 | |-------------|--------------------------|--------------| | Subtitles | 34,261 | 3,110 | | Riksdag | 21,949 | 5,119 | | ISOF | 54 | 54 | | NST | 250 | 250 | | **Total** | **56,514** | **8,533** | The default when loading our models through Hugging Face is **Stage 2**. We have however also uploaded continued pretraining checkpoints and tagged them. You can load these other checkpoints by specifying the in . 
The pretrained checkpoints tag can for example be found here: []( The Stage 2 default model tag is named . We supply a different stage 2 checkpoint -- with a more condensed style of transcribing -- under the name . ### Evaluation #### WER | Model size | | FLEURS | CommonVoice | NST | |------------|---------|--------|-------------|------| | tiny | **KBLab** | **13.2** | **12.9** | **11.2** | | | OpenAI | 59.2 | 67.8 | 85.2 | | base | **KBLab** | **9.1** | **8.7** | **7.8** | | | OpenAI | 39.6 | 52.1 | 53.4 | | small | **KBLab** | **7.3** | **6.4** | **6.6** | | | OpenAI | 20.6 | 26.4 | 26.4 | | medium | **KBLab** | **6.6** | **5.4** | **5.8** | | | OpenAI | 12.1 | 15.8 | 17.1 | | large-v3 | **KBLab** | **5.4** | **4.1** | **5.2** | | | OpenAI | 7.8 | 9.5 | 11.3 | #### BLEU Score | Model size | | FLEURS | CommonVoice | NST | |------------|---------|--------|-------------|------| | tiny | KBLab | **76.6** | **73.7** | **74.3** | | | OpenAI | 26.9 | 21.1 | 24.0 | | base | KBLab | **83.2** | **79.9** | **78.3** | | | OpenAI | 41.1 | 32.5 | 36.9 | | small | KBLab | **86.6** | **83.5** | **79.6** | | | OpenAI | 64.0 | 56.5 | 58.2 | | medium | KBLab | **87.6** | **85.0** | **80.2** | | | OpenAI | 77.1 | 70.1 | 68.9 | | large-v3 | KBLab | **89.8** | **87.2** | **81.1** | | | OpenAI | 84.9 | 79.1 | 75.1 | ### Acknowledgements We acknowledge the EuroHPC Joint Undertaking for awarding this project access to the EuroHPC supercomputer LEONARDO, hosted by CINECA (Italy) and the LEONARDO consortium through an EuroHPC AI and Data-Intensive Applications Access call. ### Citation Paper reference coming soon.",
+ "model_explanation_gemini": "A Swedish-optimized automatic speech recognition model trained on 50,000+ hours of Swedish speech, significantly reducing word error rates compared to OpenAI's Whisper models. \n\n**Features**: \n- Optimized for Swedish speech recognition \n- Trained on 50,000+ hours of Swedish audio \n- Lower Word Error Rate (WER) than OpenAI's Whisper models (47% average improvement) \n- Available in multiple formats (CT2, GGML, ONNX, Transformers"
+}
\ No newline at end of file
diff --git a/model_data_json/LGAI-EXAONE_EXAONE-3.0-7.8B-Instruct.json b/model_data_json/LGAI-EXAONE_EXAONE-3.0-7.8B-Instruct.json
new file mode 100644
index 0000000000000000000000000000000000000000..14777a0cdebe90155ecb18fd1a0b61a38919b2c3
--- /dev/null
+++ b/model_data_json/LGAI-EXAONE_EXAONE-3.0-7.8B-Instruct.json
@@ -0,0 +1,21 @@
+{
+ "model_id": "LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct",
+ "downloads": 69297,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "exaone",
+ "text-generation",
+ "lg-ai",
+ "conversational",
+ "custom_code",
+ "en",
+ "ko",
+ "arxiv:2408.03541",
+ "license:other",
+ "autotrain_compatible",
+ "region:us"
+ ],
+ "description": "--- license: other license_name: exaone license_link: LICENSE language: - en - ko tags: - lg-ai - exaone ---
# EXAONE-3.0-7.8B-Instruct **👋👋 We have revised our license for revitalizing the research ecosystem.👋👋** ## Introduction We introduce EXAONE-3.0-7.8B-Instruct, a pre-trained and instruction-tuned bilingual (English and Korean) generative model with 7.8 billion parameters. The model was pre-trained with 8T curated tokens and post-trained with supervised fine-tuning and direct preference optimization. It demonstrates highly competitive benchmark performance against other state-of-the-art open models of similar size. For more details, please refer to our technical report, blog and GitHub. ## Quickstart We recommend to use transformers v4.41 or later. > ### Note > The EXAONE 3.0 instruction-tuned language model was trained to utilize the system prompt, > so we highly recommend using the system prompts provided in the code snippet above. ## Evaluation We compared EXAONE-3.0-7.8B-Instruct with similar-sized instruction-tuned LLMs. To verify the performance of real-world use cases, we measured benchmarks that have a high correlation with LMSYS Chatbot Arena. Some experimental results are shown below. The full evaluation results can be found in the technical report. | Language | Benchmark | EXAONE 3.0 7.8B Inst. | Llama 3.1 8B Inst. | Gemma 2 9B Inst. | QWEN 2 7B Inst. | Phi 3 7B Inst. | Mistral 7B Inst. | | :-----: | :----- | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | | English | MT-Bench | **9.01** | 7.95 | 8.52 | 8.41 | 8.52 | 7.72 | | | Arena-Hard-v0.1 | **46.8** | 28.0 | 42.1 | 21.7 | 29.1 | 16.2 | | | WildBench | **48.2** | 34.5 | 41.5 | 34.9 | 32.8 | 29.0 | | | AlpacaEval 2.0 LC | 45.0 | 31.5 | **47.5** | 24.5 | 37.1 | 31.0 | | Korean | KoMT-Bench[1] | **8.92** | 6.06 | 7.92 | 7.69 | 4.87 | 5.20 | | | LogicKor | **8.62** | 5.40 | 8.07 | 6.12 | 3.76 | 3.42 | - [1] KoMT-Bench is a dataset created by translating MT-Bench into Korean; see README for more details. ## Limitation The EXAONE language model has certain limitations and may occasionally generate inappropriate responses. The language model generates responses based on the output probability of tokens, and it is determined during learning from training data. While we have made every effort to exclude personal, harmful, and biased information from the training data, some problematic content may still be included, potentially leading to undesirable responses. Please note that the text generated by EXAONE language model does not reflects the views of LG AI Research. - Inappropriate answers may be generated, which contain personal, harmful or other inappropriate information. - Biased responses may be generated, which are associated with age, gender, race, and so on. - The generated responses rely heavily on statistics from the training data, which can result in the generation of semantically or syntactically incorrect sentences. - Since the model does not reflect the latest information, the responses may be false or contradictory. LG AI Research strives to reduce potential risks that may arise from EXAONE language model. Users are not allowed to engage in any malicious activities (e.g., keying in illegal information) that may induce the creation of inappropriate outputs violating LG AI’s ethical principles when using EXAONE language model. ## License The model is licensed under EXAONE AI Model License Agreement 1.1 - NC ## Citation ## Contact LG AI Research Technical Support: contact_us@lgresearch.ai",
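A minimal sketch of the Quickstart described above, assuming transformers >= 4.41 with `trust_remote_code=True`; the system prompt wording is illustrative, since the recommended prompt from the original snippet is not reproduced in this card.

```python
# Minimal sketch: chat-style generation with EXAONE-3.0-7.8B-Instruct.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto"
)

messages = [
    # Assumed system prompt wording; the model is trained to expect a system prompt.
    {"role": "system", "content": "You are EXAONE model from LG AI Research, a helpful assistant."},
    {"role": "user", "content": "Explain what makes a good espresso."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```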
+ "model_explanation_gemini": "A 7.8B-parameter bilingual (English and Korean) instruction-tuned generative model optimized for competitive performance in benchmarks, trained with supervised fine-tuning and direct preference optimization. \n\n**Features:** \n- Bilingual (English and Korean) \n- 7.8 billion parameters \n- Pre-trained on 8T tokens \n- Instruction-tuned with supervised fine-tuning and DPO \n- Competitive benchmark performance against similar-sized models \n- Requires system prompts for optimal use \n\n**Comparison:**"
+}
\ No newline at end of file
diff --git a/model_data_json/LorenzoDeMattei_GePpeTto.json b/model_data_json/LorenzoDeMattei_GePpeTto.json
new file mode 100644
index 0000000000000000000000000000000000000000..9ff336fc1baa575dbe48eae3ad009228a9cdb409
--- /dev/null
+++ b/model_data_json/LorenzoDeMattei_GePpeTto.json
@@ -0,0 +1,20 @@
+{
+ "model_id": "LorenzoDeMattei/GePpeTto",
+ "downloads": 78170,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "jax",
+ "safetensors",
+ "gpt2",
+ "text-generation",
+ "it",
+ "arxiv:2004.14253",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- language: it --- # GePpeTto GPT2 Model 🇮🇹 Pretrained GPT2 117M model for Italian. You can find further details in the paper: Lorenzo De Mattei, Michele Cafagna, Felice Dell’Orletta, Malvina Nissim, Marco Guerini \"GePpeTto Carves Italian into a Language Model\", arXiv preprint. Pdf available at: ## Pretraining Corpus The pretraining set comprises two main sources. The first one is a dump of Italian Wikipedia (November 2019), consisting of 2.8GB of text. The second one is the ItWac corpus (Baroni et al., 2009), which amounts to 11GB of web texts. This collection provides a mix of standard and less standard Italian, on a rather wide chronological span, with older texts than the Wikipedia dump (the latter stretches only to the late 2000s). ## Pretraining details This model was trained using GPT2's Hugging Face implemenation on 4 NVIDIA Tesla T4 GPU for 620k steps. Training parameters: - GPT-2 small configuration - vocabulary size: 30k - Batch size: 32 - Block size: 100 - Adam Optimizer - Initial learning rate: 5e-5 - Warm up steps: 10k ## Perplexity scores | Domain | Perplexity | |---|---| | Wikipedia | 26.1052 | | ItWac | 30.3965 | | Legal | 37.2197 | | News | 45.3859 | | Social Media | 84.6408 | For further details, qualitative analysis and human evaluation check out: ## Load Pretrained Model You can use this model by installing Huggingface library . And you can use it directly by initializing it like this: ## Example using GPT2LMHeadModel Output is, ## Citation Please use the following bibtex entry: ## References Marco Baroni, Silvia Bernardini, Adriano Ferraresi, and Eros Zanchetta. 2009. The WaCky wide web: a collection of very large linguistically processed webcrawled corpora. Language resources and evaluation, 43(3):209–226.",
+ "model_explanation_gemini": "\"GePpeTto is a 117M-parameter GPT-2 model pretrained on Italian text from Wikipedia and ItWac for generating and processing Italian language.\"\n\n**Model Features:** \n- Language: Italian (🇮🇹) \n- Architecture: GPT-2 small (117M parameters) \n- Pretraining Corpus: Italian Wikipedia (2.8GB) + ItWac (11GB) \n- Training: 620k steps on 4× Tesla T4"
+}
\ No newline at end of file
diff --git a/model_data_json/M-CLIP_XLM-Roberta-Large-Vit-B-16Plus.json b/model_data_json/M-CLIP_XLM-Roberta-Large-Vit-B-16Plus.json
new file mode 100644
index 0000000000000000000000000000000000000000..eafad916ccbeff94d8d67df661287388568b8e9a
--- /dev/null
+++ b/model_data_json/M-CLIP_XLM-Roberta-Large-Vit-B-16Plus.json
@@ -0,0 +1,61 @@
+{
+ "model_id": "M-CLIP/XLM-Roberta-Large-Vit-B-16Plus",
+ "downloads": 83225,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "tf",
+ "multilingual",
+ "af",
+ "sq",
+ "am",
+ "ar",
+ "az",
+ "bn",
+ "bs",
+ "bg",
+ "ca",
+ "zh",
+ "hr",
+ "cs",
+ "da",
+ "nl",
+ "en",
+ "et",
+ "fr",
+ "de",
+ "el",
+ "hi",
+ "hu",
+ "is",
+ "id",
+ "it",
+ "ja",
+ "mk",
+ "ml",
+ "mr",
+ "pl",
+ "pt",
+ "ro",
+ "ru",
+ "sr",
+ "sl",
+ "es",
+ "sw",
+ "sv",
+ "tl",
+ "te",
+ "tr",
+ "tk",
+ "uk",
+ "ur",
+ "ug",
+ "uz",
+ "vi",
+ "xh",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- language: - multilingual - af - sq - am - ar - az - bn - bs - bg - ca - zh - hr - cs - da - nl - en - et - fr - de - el - hi - hu - is - id - it - ja - mk - ml - mr - pl - pt - ro - ru - sr - sl - es - sw - sv - tl - te - tr - tk - uk - ur - ug - uz - vi - xh --- ## Multilingual-clip: XLM-Roberta-Large-Vit-B-16Plus Multilingual-CLIP extends OpenAI's English text encoders to multiple other languages. This model *only* contains the multilingual text encoder. The corresponding image model can be retrieved via instructions found on open_clip repository on Github. We provide a usage example below. ## Requirements To use both the multilingual text encoder and corresponding image encoder, we need to install the packages []( and []( ## Usage Extracting embeddings from the text encoder can be done in the following way: Extracting embeddings from the corresponding image encoder: ## Evaluation results None of the M-CLIP models have been extensivly evaluated, but testing them on Txt2Img retrieval on the humanly translated MS-COCO dataset, we see the following **R@10** results: | Name | En | De | Es | Fr | Zh | It | Pl | Ko | Ru | Tr | Jp | | ----------------------------------|:-----: |:-----: |:-----: |:-----: | :-----: |:-----: |:-----: |:-----: |:-----: |:-----: |:-----: | | OpenAI CLIP Vit-B/32| 90.3 | - | - | - | - | - | - | - | - | - | - | | OpenAI CLIP Vit-L/14| 91.8 | - | - | - | - | - | - | - | - | - | - | | OpenCLIP ViT-B-16+-| 94.3 | - | - | - | - | - | - | - | - | - | - | | LABSE Vit-L/14| 91.6 | 89.6 | 89.5 | 89.9 | 88.9 | 90.1 | 89.8 | 80.8 | 85.5 | 89.8 | 73.9 | | XLM-R Large Vit-B/32| 91.8 | 88.7 | 89.1 | 89.4 | 89.3 | 89.8| 91.4 | 82.1 | 86.1 | 88.8 | 81.0 | | XLM-R Vit-L/14| 92.4 | 90.6 | 91.0 | 90.0 | 89.7 | 91.1 | 91.3 | 85.2 | 85.8 | 90.3 | 81.9 | | XLM-R Large Vit-B/16+| **95.0** | **93.0** | **93.6** | **93.1** | **94.0** | **93.1** | **94.4** | **89.0** | **90.0** | **93.0** | **84.2** | ## Training/Model details Further details about the model training and data can be found in the model card.",
+ "model_explanation_gemini": "Extends OpenAI's CLIP text encoder to multiple languages for generating multilingual text embeddings that pair with an image encoder for cross-modal tasks like text-to-image retrieval.\n\nModel Features: \n- Multilingual text encoder (supports 50+ languages) \n- Based on XLM-Roberta-Large architecture \n- Paired with ViT-B-16Plus image encoder (separately available) \n- Designed for cross-modal retrieval tasks \n\nComparison: \nThe XLM-R Large Vit-B/16"
+}
\ No newline at end of file
diff --git a/model_data_json/MahmoodLab_TITAN.json b/model_data_json/MahmoodLab_TITAN.json
new file mode 100644
index 0000000000000000000000000000000000000000..f384432848aa36c5a82eae4d9b58b26121beca94
--- /dev/null
+++ b/model_data_json/MahmoodLab_TITAN.json
@@ -0,0 +1,22 @@
+{
+ "model_id": "MahmoodLab/TITAN",
+ "downloads": 239360,
+ "tags": [
+ "safetensors",
+ "titan",
+ "histology",
+ "pathology",
+ "vision",
+ "pytorch",
+ "self-supervised",
+ "vit",
+ "image-feature-extraction",
+ "custom_code",
+ "en",
+ "arxiv:2411.19666",
+ "license:cc-by-nc-nd-4.0",
+ "region:us"
+ ],
+ "description": "--- license: cc-by-nc-nd-4.0 language: - en tags: - histology - pathology - vision - pytorch - self-supervised - vit extra_gated_prompt: >- This model and associated code are released under the CC-BY-NC-ND 4.0 license and may only be used for non-commercial, academic research purposes with proper attribution. Any commercial use, sale, or other monetization of the TITAN model and its derivatives, which include models trained on outputs from the TITAN model or datasets created from the TITAN model, is prohibited and requires prior approval. Please note that the primary email used to sign up for your Hugging Face account must match your institutional email to receive approval. By downloading the model, you attest that all information (affiliation, research use) is correct and up-to-date. Downloading the model requires prior registration on Hugging Face and agreeing to the terms of use. By downloading this model, you agree not to distribute, publish or reproduce a copy of the model. If another user within your organization wishes to use the TITAN model, they must register as an individual user and agree to comply with the terms of use. Users may not attempt to re-identify the deidentified data used to develop the underlying model. If you are a commercial entity, please contact the corresponding author. extra_gated_fields: Full name (first and last): text Current affiliation (no abbreviations): text Type of Affiliation: type: select options: - Academia - Industry - label: Other value: other Current and official institutional email (**this must match your primary email in your Hugging Face account, @gmail/@hotmail/@qq email domains will be denied**): text Please explain your intended research use: text I agree to all terms outlined above: checkbox I agree to use this model for non-commercial, academic purposes only: checkbox I agree not to distribute the model, if another user within your organization wishes to use the TITAN model, they must register as an individual user: checkbox metrics: - accuracy pipeline_tag: image-feature-extraction --- # Model Card for TITAN-preview \\[Preprint\\] | \\[Github Repo\\] | \\[Cite\\] ## What is TITAN? **TITAN** (**T**ransformer-based pathology **I**mage and **T**ext **A**lignment **N**etwork) is a multimodal whole-slide foundation model pre-trained using visual self-supervised learning and vision-language alignment. It leverages 335,645 whole-slide images (WSIs) from a diverse set of internally collected neoplastic, infectious, and inflammatory cases at Mass General Brigham. Additionally, TITAN utilizes over 182,000 pathology reports and more than 423,000 synthetic captions generated by PathChat, our pathology co-pilot. TITAN's slide embeddings achieve state-of-the-art performance on diverse downstream tasks, including linear probing, few-shot and zero-shot classification, rare cancer retrieval, cross-modal retrieval, and pathology report generation. This is a preview and we will bring you further updates and improvements. **your request will be denied**. To fix this, you can: (1) add your official institutional email to your HF account, and confirm your email address to verify, and (2) set your institutional email as your primary email in your HF account. Other reasons for your request access being denied include other mistakes in the form submitted, for example: full name includes abbreviations, affiliation is not spelled out, the described research use is not sufficient, or email domain address not recognized. 
## Model Description - **Developed by:** Mahmood Lab AI for Pathology @ Harvard/BWH - **Model type:** Pretrained vision-language encoders - **Pretraining dataset:** Mass-340K, sourced from private histology collections (BWH / MGH), in addition to slides from the public GTEx consortium. - **Repository:** - **Preprint:** - **License:** CC-BY-NC-ND-4.0 ### Requirements ### Model Usage TITAN-preview is a vision-lanuage model trained on CONCH v1.5 patch features with patch size of 512x512 pixels at 20x magnification. Following authentication (using ), both TITAN-preview (slide and language encoders) and CONCH v1.5 (patch encoder) can be loaded using the commands below: You can directly use TITAN-preview for slide-level feature extaction. TITAN builds a feature grids from CONCH v1.5 patch features using the coordinates and the distance between the patches. As patch coordinates are always saved at the slides' level 0 magnification, TITAN takes patch_size_lv0 which represents the distance between two adjacent patches at level 0 magnification. It is 1024 if slide is 40x, or 512 if slide is 20x. We have this info saved in our demo TCGA features. Slide-level feature extraction can be done in the following way: These pre-extracted features can then be used for slide-level classification (via linear probing), retrieval (via l2 distance), and other machine learning settings, without task-specific finetuning. We also released all TCGA TITAN-preview features in . We demonstrated more detailed linear probe and zero-shot evaluation in our github. ## License and Terms of Use This model and associated code are released under the CC-BY-NC-ND 4.0 license and may only be used for non-commercial, academic research purposes with proper attribution. Any commercial use, sale, or other monetization of the TITAN model and its derivatives, which include models trained on outputs from the TITAN model or datasets created from the TITAN model, is prohibited and requires prior approval. Downloading the model requires prior registration on Hugging Face and agreeing to the terms of use. By downloading this model, you agree not to distribute, publish or reproduce a copy of the model. If another user within your organization wishes to use the TITAN model, they must register as an individual user and agree to comply with the terms of use. Users may not attempt to re-identify the deidentified data used to develop the underlying model. If you are a commercial entity, please contact the corresponding author. ## Contact For any additional questions or comments, contact Faisal Mahmood (), \\ Tong Ding (), \\ Sophia J. Wagner (), \\ Andrew H. Song (), \\ or Richard J. Chen (), ## Acknowledgements The project was built on top of amazing repositories such as ViT, iBOT, OpenClip, LGSSL, and Timm (ViT model implementation). We thank the authors and developers for their contribution. ## BibTeX If you found our work useful in your research, please consider citing our work at: Ding, T.\\*, Wagner S.J.\\*, Song, A.H.\\*, Chen, R.J.\\* et al. Multimodal Whole Slide Foundation Model for Pathology, Arxiv, 2024",
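A minimal sketch of the slide-level feature extraction described above; the `encode_slide_from_patch_features` call, argument order, and tensor shapes are assumptions patterned on the TITAN GitHub examples and are not verified against this card.

```python
# Minimal sketch: loading TITAN and extracting a slide embedding from
# pre-extracted CONCH v1.5 patch features (all inputs below are placeholders).
import torch
from transformers import AutoModel

titan = AutoModel.from_pretrained("MahmoodLab/TITAN", trust_remote_code=True)

# Hypothetical pre-extracted inputs: patch features, their level-0 coordinates,
# and the patch distance at level 0 (1024 for 40x slides, 512 for 20x slides).
patch_features = torch.randn(1, 4096, 768)            # (batch, num_patches, feature_dim), illustrative
patch_coords = torch.randint(0, 100_000, (1, 4096, 2))  # (batch, num_patches, xy) at level 0
patch_size_lv0 = 512

with torch.no_grad():
    # Method name is an assumption based on the TITAN repository demo.
    slide_embedding = titan.encode_slide_from_patch_features(
        patch_features, patch_coords, patch_size_lv0
    )
print(slide_embedding.shape)
```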
+ "model_explanation_gemini": "TITAN is a multimodal foundation model for pathology, combining visual self-supervised learning and vision-language alignment to analyze whole-slide images and generate pathology reports, achieving state-of-the-art performance in tasks like classification, retrieval, and report generation.\n\nFeatures: \n- Multimodal (image + text) pathology model \n- Pretrained on 335,645 WSIs and 182,000+ reports \n- Self-supervised ViT architecture \n- Supports zero-shot/few-shot learning \n-"
+}
\ No newline at end of file
diff --git a/model_data_json/MoritzLaurer_bge-m3-zeroshot-v2.0.json b/model_data_json/MoritzLaurer_bge-m3-zeroshot-v2.0.json
new file mode 100644
index 0000000000000000000000000000000000000000..1ed715fcfa5626f149d041a2d42ac94dbff99d29
--- /dev/null
+++ b/model_data_json/MoritzLaurer_bge-m3-zeroshot-v2.0.json
@@ -0,0 +1,22 @@
+{
+ "model_id": "MoritzLaurer/bge-m3-zeroshot-v2.0",
+ "downloads": 70245,
+ "tags": [
+ "transformers",
+ "onnx",
+ "safetensors",
+ "xlm-roberta",
+ "text-classification",
+ "zero-shot-classification",
+ "multilingual",
+ "arxiv:2312.17543",
+ "base_model:BAAI/bge-m3-retromae",
+ "base_model:quantized:BAAI/bge-m3-retromae",
+ "license:mit",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- language: - multilingual tags: - text-classification - zero-shot-classification base_model: BAAI/bge-m3-retromae pipeline_tag: zero-shot-classification library_name: transformers license: mit --- # Model description: bge-m3-zeroshot-v2.0 ## zeroshot-v2.0 series of models Models in this series are designed for efficient zeroshot classification with the Hugging Face pipeline. These models can do classification without training data and run on both GPUs and CPUs. An overview of the latest zeroshot classifiers is available in my Zeroshot Classifier Collection. The main update of this series of models is that several models are trained on fully commercially-friendly data for users with strict license requirements. These models can do one universal classification task: determine whether a hypothesis is \"true\" or \"not true\" given a text ( vs. ). This task format is based on the Natural Language Inference task (NLI). The task is so universal that any classification task can be reformulated into this task by the Hugging Face pipeline. ## Training data Models with a \"\" in the name are trained on two types of fully commercially-friendly data: 1. Synthetic data generated with Mixtral-8x7B-Instruct-v0.1. I first created a list of 500+ diverse text classification tasks for 25 professions in conversations with Mistral-large. The data was manually curated. I then used this as seed data to generate several hundred thousand texts for these tasks with Mixtral-8x7B-Instruct-v0.1. The final dataset used is available in the synthetic_zeroshot_mixtral_v0.1 dataset in the subset . Data curation was done in multiple iterations and will be improved in future iterations. 2. Two commercially-friendly NLI datasets: (MNLI, FEVER-NLI). These datasets were added to increase generalization. 3. Models without a \"\" in the name also included a broader mix of training data with a broader mix of licenses: ANLI, WANLI, LingNLI, and all datasets in this list where . ## How to use the models forces the model to decide on only one class. enables the model to choose multiple classes. ## Metrics The models were evaluated on 28 different text classification tasks with the f1_macro metric. The main reference point is which is, at the time of writing (03.04.24), the most used commercially-friendly 0-shot classifier. 
!results_aggreg_v2.0 | | facebook/bart-large-mnli | roberta-base-zeroshot-v2.0-c | roberta-large-zeroshot-v2.0-c | deberta-v3-base-zeroshot-v2.0-c | deberta-v3-base-zeroshot-v2.0 (fewshot) | deberta-v3-large-zeroshot-v2.0-c | deberta-v3-large-zeroshot-v2.0 (fewshot) | bge-m3-zeroshot-v2.0-c | bge-m3-zeroshot-v2.0 (fewshot) | |:---------------------------|---------------------------:|-----------------------------:|------------------------------:|--------------------------------:|-----------------------------------:|---------------------------------:|------------------------------------:|-----------------------:|--------------------------:| | all datasets mean | 0.497 | 0.587 | 0.622 | 0.619 | 0.643 (0.834) | 0.676 | 0.673 (0.846) | 0.59 | (0.803) | | amazonpolarity (2) | 0.937 | 0.924 | 0.951 | 0.937 | 0.943 (0.961) | 0.952 | 0.956 (0.968) | 0.942 | (0.951) | | imdb (2) | 0.892 | 0.871 | 0.904 | 0.893 | 0.899 (0.936) | 0.923 | 0.918 (0.958) | 0.873 | (0.917) | | appreviews (2) | 0.934 | 0.913 | 0.937 | 0.938 | 0.945 (0.948) | 0.943 | 0.949 (0.962) | 0.932 | (0.954) | | yelpreviews (2) | 0.948 | 0.953 | 0.977 | 0.979 | 0.975 (0.989) | 0.988 | 0.985 (0.994) | 0.973 | (0.978) | | rottentomatoes (2) | 0.83 | 0.802 | 0.841 | 0.84 | 0.86 (0.902) | 0.869 | 0.868 (0.908) | 0.813 | (0.866) | | emotiondair (6) | 0.455 | 0.482 | 0.486 | 0.459 | 0.495 (0.748) | 0.499 | 0.484 (0.688) | 0.453 | (0.697) | | emocontext (4) | 0.497 | 0.555 | 0.63 | 0.59 | 0.592 (0.799) | 0.699 | 0.676 (0.81) | 0.61 | (0.798) | | empathetic (32) | 0.371 | 0.374 | 0.404 | 0.378 | 0.405 (0.53) | 0.447 | 0.478 (0.555) | 0.387 | (0.455) | | financialphrasebank (3) | 0.465 | 0.562 | 0.455 | 0.714 | 0.669 (0.906) | 0.691 | 0.582 (0.913) | 0.504 | (0.895) | | banking77 (72) | 0.312 | 0.124 | 0.29 | 0.421 | 0.446 (0.751) | 0.513 | 0.567 (0.766) | 0.387 | (0.715) | | massive (59) | 0.43 | 0.428 | 0.543 | 0.512 | 0.52 (0.755) | 0.526 | 0.518 (0.789) | 0.414 | (0.692) | | wikitoxic_toxicaggreg (2) | 0.547 | 0.751 | 0.766 | 0.751 | 0.769 (0.904) | 0.741 | 0.787 (0.911) | 0.736 | (0.9) | | wikitoxic_obscene (2) | 0.713 | 0.817 | 0.854 | 0.853 | 0.869 (0.922) | 0.883 | 0.893 (0.933) | 0.783 | (0.914) | | wikitoxic_threat (2) | 0.295 | 0.71 | 0.817 | 0.813 | 0.87 (0.946) | 0.827 | 0.879 (0.952) | 0.68 | (0.947) | | wikitoxic_insult (2) | 0.372 | 0.724 | 0.798 | 0.759 | 0.811 (0.912) | 0.77 | 0.779 (0.924) | 0.783 | (0.915) | | wikitoxic_identityhate (2) | 0.473 | 0.774 | 0.798 | 0.774 | 0.765 (0.938) | 0.797 | 0.806 (0.948) | 0.761 | (0.931) | | hateoffensive (3) | 0.161 | 0.352 | 0.29 | 0.315 | 0.371 (0.862) | 0.47 | 0.461 (0.847) | 0.291 | (0.823) | | hatexplain (3) | 0.239 | 0.396 | 0.314 | 0.376 | 0.369 (0.765) | 0.378 | 0.389 (0.764) | 0.29 | (0.729) | | biasframes_offensive (2) | 0.336 | 0.571 | 0.583 | 0.544 | 0.601 (0.867) | 0.644 | 0.656 (0.883) | 0.541 | (0.855) | | biasframes_sex (2) | 0.263 | 0.617 | 0.835 | 0.741 | 0.809 (0.922) | 0.846 | 0.815 (0.946) | 0.748 | (0.905) | | biasframes_intent (2) | 0.616 | 0.531 | 0.635 | 0.554 | 0.61 (0.881) | 0.696 | 0.687 (0.891) | 0.467 | (0.868) | | agnews (4) | 0.703 | 0.758 | 0.745 | 0.68 | 0.742 (0.898) | 0.819 | 0.771 (0.898) | 0.687 | (0.892) | | yahootopics (10) | 0.299 | 0.543 | 0.62 | 0.578 | 0.564 (0.722) | 0.621 | 0.613 (0.738) | 0.587 | (0.711) | | trueteacher (2) | 0.491 | 0.469 | 0.402 | 0.431 | 0.479 (0.82) | 0.459 | 0.538 (0.846) | 0.471 | (0.518) | | spam (2) | 0.505 | 0.528 | 0.504 | 0.507 | 0.464 (0.973) | 0.74 | 0.597 (0.983) | 0.441 | (0.978) | | wellformedquery (2) | 
0.407 | 0.333 | 0.333 | 0.335 | 0.491 (0.769) | 0.334 | 0.429 (0.815) | 0.361 | (0.718) | | manifesto (56) | 0.084 | 0.102 | 0.182 | 0.17 | 0.187 (0.376) | 0.258 | 0.256 (0.408) | 0.147 | (0.331) | | capsotu (21) | 0.34 | 0.479 | 0.523 | 0.502 | 0.477 (0.664) | 0.603 | 0.502 (0.686) | 0.472 | (0.644) | These numbers indicate zeroshot performance, as no data from these datasets was added in the training mix. Note that models without a \"\" in the title were evaluated twice: one run without any data from these 28 datasets to test pure zeroshot performance (the first number in the respective column) and the final run including up to 500 training data points per class from each of the 28 datasets (the second number in brackets in the column, \"fewshot\"). No model was trained on test data. Details on the different datasets are available here: ## When to use which model - **deberta-v3-zeroshot vs. roberta-zeroshot**: deberta-v3 performs clearly better than roberta, but it is a bit slower. roberta is directly compatible with Hugging Face's production inference TEI containers and flash attention. These containers are a good choice for production use-cases. tl;dr: For accuracy, use a deberta-v3 model. If production inference speed is a concern, you can consider a roberta model (e.g. in a TEI container and HF Inference Endpoints). - **commercial use-cases**: models with \"\" in the title are guaranteed to be trained on only commercially-friendly data. Models without a \"\" were trained on more data and perform better, but include data with non-commercial licenses. Legal opinions diverge if this training data affects the license of the trained model. For users with strict legal requirements, the models with \"\" in the title are recommended. - **Multilingual/non-English use-cases**: use bge-m3-zeroshot-v2.0 or bge-m3-zeroshot-v2.0-c. Note that multilingual models perform worse than English-only models. You can therefore also first machine translate your texts to English with libraries like EasyNMT and then apply any English-only model to the translated data. Machine translation also facilitates validation in case your team does not speak all languages in the data. - **context window**: The models can process up to 8192 tokens. The other models can process up to 512. Note that longer text inputs both make the mode slower and decrease performance, so if you're only working with texts of up to 400~ words / 1 page, use e.g. a deberta model for better performance. - The latest updates on new models are always available in the Zeroshot Classifier Collection. ## Reproduction Reproduction code is available in the directory here: ## Limitations and bias The model can only do text classification tasks. Biases can come from the underlying foundation model, the human NLI training data and the synthetic data generated by Mixtral. ## License The foundation model was published under the MIT license. The licenses of the training data vary depending on the model, see above. ## Citation This model is an extension of the research described in this paper. If you use this model academically, please cite: ### Ideas for cooperation or questions? If you have questions or ideas for cooperation, contact me at moritz{at}huggingface{dot}co or LinkedIn ### Flexible usage and \"prompting\" You can formulate your own hypotheses by changing the of the zeroshot pipeline. Similar to \"prompt engineering\" for LLMs, you can test different formulations of your and verbalized classes to improve performance.",
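A minimal sketch of the pipeline usage and hypothesis "prompting" described above; the example text, candidate labels, and hypothesis template are illustrative only.

```python
# Minimal sketch: zero-shot classification with a custom hypothesis template.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="MoritzLaurer/bge-m3-zeroshot-v2.0")

text = "Kundtjänsten svarade aldrig på mitt mejl och jag vill ha pengarna tillbaka."
labels = ["complaint", "praise", "question"]

# multi_label=False forces the model to pick a single class;
# multi_label=True lets it assign several classes independently.
result = classifier(
    text,
    labels,
    multi_label=False,
    hypothesis_template="This customer message is about {}.",
)
print(result["labels"][0], round(result["scores"][0], 3))
```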
+ "model_explanation_gemini": "A multilingual zero-shot text classification model trained on commercially-friendly data to determine if a hypothesis is \"true\" or \"not true\" for a given text, reformulating any classification task into a Natural Language Inference (NLI) format. \n\n**Features:** \n- Zero-shot classification (no training data required) \n- Supports GPU and CPU inference \n- Commercially-friendly training data (synthetic Mixtral-generated data + MNLI/FEVER-NLI) \n- Handles single-label"
+}
\ No newline at end of file
diff --git a/model_data_json/NbAiLab_nb-wav2vec2-300m-nynorsk.json b/model_data_json/NbAiLab_nb-wav2vec2-300m-nynorsk.json
new file mode 100644
index 0000000000000000000000000000000000000000..d16f808f58929d2f9dad103460f42dd632423d05
--- /dev/null
+++ b/model_data_json/NbAiLab_nb-wav2vec2-300m-nynorsk.json
@@ -0,0 +1,21 @@
+{
+ "model_id": "NbAiLab/nb-wav2vec2-300m-nynorsk",
+ "downloads": 73025,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "tensorboard",
+ "safetensors",
+ "wav2vec2",
+ "automatic-speech-recognition",
+ "nn",
+ "dataset:NbAiLab/NPSC",
+ "arxiv:2307.01672",
+ "license:apache-2.0",
+ "model-index",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 tags: - automatic-speech-recognition datasets: - NbAiLab/NPSC language: - nn model-index: - name: nb-wav2vec2-300m-nynorsk results: - task: name: Automatic Speech Recognition type: automatic-speech-recognition dataset: name: NPSC type: NbAiLab/NPSC args: 16K_mp3_nynorsk metrics: - name: Test (Nynorsk) WER type: wer value: 0.1222 - name: Test (Nynorsk) CER type: cer value: 0.0419 --- # Norwegian Wav2Vec2 Model - 300M - VoxRex - Nynorsk This model is finetuned on top of feature extractor VoxRex-model from the National Library of Sweden. The finetuned model achieves the following results on the test set with a 5-gram KenLM. The numbers in parentheses are the results without the language model: - **WER: 0.1222** (0.1537) - **CER: 0.0419** (0.0468) ## Model description This is one of several Wav2Vec-models our team created during the 🤗 hosted Robust Speech Event. This is the complete list of our models and their final scores: | Model | Final WER | | |:--------------|:------------|:------------:| | NbAiLab/nb-wav2vec2-1b-bokmaal | 6.33 | | | NbAiLab/nb-wav2vec2-300m-bokmaal | 7.03 | | | NbAiLab/nb-wav2vec2-1b-nynorsk | 11.32 | | | NbAiLab/nb-wav2vec2-300m-nynorsk (this model) | 12.22 | | ### Dataset In parallel with the event, the team also converted the Norwegian Parliamentary Speech Corpus (NPSC) to the NbAiLab/NPSC in 🤗 Dataset format and used that as the main source for training. ## Code We have released all the code developed during the event so that the Norwegian NLP community can build upon it when developing even better Norwegian ASR models. The finetuning of these models is not very computationally demanding. After following the instructions here, you should be able to train your own automatic speech recognition system in less than a day with an average GPU. ## Team The following people contributed to building this model: Rolv-Arild Braaten, Per Egil Kummervold, Andre Kåsen, Javier de la Rosa, Per Erik Solberg, and Freddy Wetjen. ## Training procedure To reproduce these results, we strongly recommend that you follow the instructions from 🤗 to train a simple Swedish model. When you have verified that you are able to do this, create a fresh new repo. You can then start by copying the files and from our repo. Running these will create all the other necessary files, and should let you reproduce our results. With some tweaks to the hyperparameters, you might even be able to build an even better ASR. Good luck! ### Language Model As the scores indicate, adding even a simple 5-gram language will improve the results. 🤗 has provided another very nice blog explaining how to add a 5-gram language model to improve the ASR model. You can build this from your own corpus, for instance by extracting some suitable text from the Norwegian Colossal Corpus. You can also skip some of the steps in the guide, and copy the 5-gram model from this repo. ### Parameters The final model was run using these parameters: Using these settings, the training might take 3-4 days on an average GPU. You can, however, get a decent model and faster results by tweaking these parameters. | Parameter| Comment | |:-------------|:-----| | per_device_train_batch_size | Adjust this to the maximum of available memory. 16 or 24 might be good settings depending on your system | |gradient_accumulation_steps |Can be adjusted even further up to increase batch size and speed up training without running into memory issues | | learning_rate|Can be increased, maybe as high as 1e-4. 
Speeds up training but might add instability | | epochs| Can be decreased significantly. This is a huge dataset and you might get a decent result already after a couple of epochs| ## Citation See",
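A minimal inference sketch with the Hugging Face ASR pipeline; note that decoding with the bundled 5-gram language model may additionally require `pyctcdecode` and `kenlm` to be installed (an assumption, not stated in this card).

```python
# Minimal sketch: Nynorsk transcription with the fine-tuned wav2vec2 model.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="NbAiLab/nb-wav2vec2-300m-nynorsk")
print(asr("tale.wav")["text"])  # placeholder path to a 16 kHz audio file
```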
+ "model_explanation_gemini": "Finetuned for automatic speech recognition in Nynorsk, this model achieves a 12.22% WER on the NPSC dataset, enhanced by a 5-gram KenLM.\n\nFeatures: \n- **Task:** Automatic Speech Recognition (ASR) \n- **Base Model:** VoxRex from the National Library of Sweden \n- **Dataset:** NbAiLab/NPSC (Norwegian Parliamentary Speech Corpus, Nynorsk subset) \n- **Metrics:** \n - W"
+}
\ No newline at end of file
diff --git a/model_data_json/OpenGVLab_InternVL2_5-38B-MPO.json b/model_data_json/OpenGVLab_InternVL2_5-38B-MPO.json
new file mode 100644
index 0000000000000000000000000000000000000000..8282c381fa3958b88f8e009207630b6118f2ce49
--- /dev/null
+++ b/model_data_json/OpenGVLab_InternVL2_5-38B-MPO.json
@@ -0,0 +1,27 @@
+{
+ "model_id": "OpenGVLab/InternVL2_5-38B-MPO",
+ "downloads": 78877,
+ "tags": [
+ "transformers",
+ "tensorboard",
+ "safetensors",
+ "internvl_chat",
+ "feature-extraction",
+ "internvl",
+ "custom_code",
+ "image-text-to-text",
+ "conversational",
+ "multilingual",
+ "dataset:OpenGVLab/MMPR-v1.1",
+ "arxiv:2312.14238",
+ "arxiv:2404.16821",
+ "arxiv:2412.05271",
+ "arxiv:2411.10442",
+ "base_model:OpenGVLab/InternVL2_5-38B",
+ "base_model:finetune:OpenGVLab/InternVL2_5-38B",
+ "license:mit",
+ "region:us"
+ ],
+ "description": "--- license: mit pipeline_tag: image-text-to-text library_name: transformers base_model: - OpenGVLab/InternVL2_5-38B base_model_relation: finetune datasets: - OpenGVLab/MMPR-v1.1 language: - multilingual tags: - internvl - custom_code --- # InternVL2_5-38B-MPO [\\[📂 GitHub\\]]( [\\[📜 InternVL 1.0\\]]( [\\[📜 InternVL 1.5\\]]( [\\[📜 InternVL 2.5\\]]( [\\[📜 InternVL2.5-MPO\\]]( [\\[🆕 Blog\\]]( [\\[🗨️ Chat Demo\\]]( [\\[🤗 HF Demo\\]]( [\\[🚀 Quick Start\\]](#quick-start) [\\[📖 Documents\\]](
## Introduction We introduce InternVL2.5-MPO, an advanced multimodal large language model (MLLM) series that demonstrates superior overall performance. This series builds upon InternVL2.5 and Mixed Preference Optimization. !image/png ## InternVL 2.5 Family In the following table, we provide an overview of the InternVL2.5-MPO series. | Model Name | Vision Part | Language Part | HF Link | | :-----------------: | :-------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------: | :------------------------------------------------------------: | | InternVL2_5-1B-MPO | InternViT-300M-448px-V2_5 | Qwen2.5-0.5B-Instruct | 🤗 link | | InternVL2_5-2B-MPO | InternViT-300M-448px-V2_5 | internlm2_5-1_8b-chat | 🤗 link | | InternVL2_5-4B-MPO | InternViT-300M-448px-V2_5 | Qwen2.5-3B-Instruct | 🤗 link | | InternVL2_5-8B-MPO | InternViT-300M-448px-V2_5 | internlm2_5-7b-chat | 🤗 link | | InternVL2_5-26B-MPO | InternViT-6B-448px-V2_5 | internlm2_5-20b-chat | 🤗 link | | InternVL2_5-38B-MPO | InternViT-6B-448px-V2_5 | Qwen2.5-32B-Instruct | 🤗 link | | InternVL2_5-78B-MPO | InternViT-6B-448px-V2_5 | Qwen2.5-72B-Instruct | 🤗 link | ## Model Architecture As shown in the following figure, InternVL2.5-MPO retains the same model architecture as InternVL 2.5 and its predecessors, InternVL 1.5 and 2.0, following the \"ViT-MLP-LLM\" paradigm. In this new version, we integrate a newly incrementally pre-trained InternViT with various pre-trained LLMs, including InternLM 2.5 and Qwen 2.5, using a randomly initialized MLP projector. !image/png As in the previous version, we applied a pixel unshuffle operation, reducing the number of visual tokens to one-quarter of the original. Besides, we adopted a similar dynamic resolution strategy as InternVL 1.5, dividing images into tiles of 448×448 pixels. The key difference, starting from InternVL 2.0, is that we additionally introduced support for multi-image and video data. ## Key Designs ### Multi-Modal Preference Dataset MMPR is a large-scale and high-quality multimodal reasoning preference dataset. This dataset includes about 3 million samples. !image/jpeg !image/jpeg To construct this dataset, we propose an efficient data construction pipeline. Specifically, we categorize the multimodal data into **samples with clear ground truths** and **samples without clear ground truths**. - **For samples with clear ground truths:** the model is prompted to first provide the reasoning process and then give the final answer in the format like . Responses matching the ground truth answer constitute the positive set \\\\(\\mathcal{Y}_p\\\\), while those that do not match make up the negative set \\\\(\\mathcal{Y}_n\\\\). Additionally, responses that fail to provide a clear final answer are also merged into \\\\(\\mathcal{Y}_n\\\\). Given these responses labeled as positive or negative, we build the preference pairs by selecting a chosen response \\\\(y_c\\\\) from \\\\(\\mathcal{Y}_p\\\\) and a negative response \\\\(y_r\\\\) from \\\\(\\mathcal{Y}_n\\\\). - **For samples without clear ground truths:** we propose a simple yet effective method: Dropout Next-Token Prediction (Dropout NTP). Specifically, we use the responses generated by InternVL2-8B as chosen answers. Given the chosen answer, we truncate it by half and then prompt InternVL2-8B to complete the remaining portion of the truncated answer without access to the image input. 
This generated completion serves as the rejected answer for the paired sample. It is worth noting that while the responses generated by InternVL2-8B may not be perfect, the completions generated without the image input will introduce more hallucinations than those generated with the image input. Therefore, the partial order relationship between the chosen and rejected responses holds true. The data construction pipeline is open-sourced, see more details in our document. ### Mixed Preference Optimization The key insight behind MPO is that *an effective PO process should enable the model to learn the relative preference between pairs of responses, the absolute quality of individual responses, and the process for generating preferred responses.* We define the training objective as a combination of preference loss \\\\(\\mathcal{L}_{\\text{p}}\\\\), quality loss \\\\(\\mathcal{L}_{\\text{q}}\\\\), and generation loss \\\\(\\mathcal{L}_{\\text{g}}\\\\), referred to as Mixed Preference Optimization: $$ \\mathcal{L}=w_{p}\\cdot\\mathcal{L}_{\\text{p}} + w_{q}\\cdot\\mathcal{L}_{\\text{q}} + w_{g}\\cdot\\mathcal{L}_{\\text{g}}, $$ where \\\\(w_{*}\\\\) represents the weight assigned to each loss component. In this work, we empirically compare different variants of preference loss. Based on the experimental results, we use DPO as our preference loss and BCO as our quality loss. Specifically, the DPO serves as the preference loss to enable the model to learn the relative preference between chosen and rejected responses. This algorithm optimizes the following loss function: $$ \\mathcal{L}_{\\text{p}}=-\\log \\sigma\\left(\\beta \\log \\frac{\\pi_\\theta\\left(y_c \\mid x\\right)}{\\pi_0\\left(y_c \\mid x\\right)}-\\beta \\log \\frac{\\pi_\\theta\\left(y_r \\mid x\\right)}{\\pi_0\\left(y_r \\mid x\\right)}\\right), $$ where \\\\(\\beta\\\\) is the KL penalty coefficient, and \\\\(x\\\\), \\\\(y_c\\\\), and \\\\(y_r\\\\) are user query, chosen response, and rejected response, respectively. The policy model \\\\(\\pi_\\theta\\\\) is initialized from model \\\\(\\pi_0\\\\). Additionally, the BCO loss is employed as the quality loss, which helps the model to understand the absolute quality of individual responses. The loss function is defined as: $$ \\mathcal{L}_{\\text{q}}=\\mathcal{L}_{\\text{q}}^+ + \\mathcal{L}_{\\text{q}}^-, $$ where \\\\(\\mathcal{L}_{\\text{q}}^{+}\\\\) and \\\\(\\mathcal{L}_{\\text{q}}^{+}\\\\) represent the loss for chosen and rejected responses, respectively. Each response type's loss is calculated independently, requiring the model to differentiate the absolute quality of individual responses. The loss terms are given by: $$ \\mathcal{L}_{\\text{q}}^+=-\\log \\sigma\\left(\\beta \\log \\frac{\\pi_\\theta\\left(y_c \\mid x\\right)}{\\pi_0\\left(y_c \\mid x\\right)} - \\delta\\right), $$ $$ \\mathcal{L}_{\\text{q}}^-=-\\log \\sigma\\left(-\\left(\\beta \\log \\frac{\\pi_\\theta\\left(y_r \\mid x\\right)}{\\pi_0\\left(y_r \\mid x\\right)} - \\delta\\right) \\right), $$ where \\\\(\\delta\\\\) represents the reward shift, calculated as the moving average of previous rewards to stabilize training. Finally, the SFT loss is used as the generation loss to help the model learn the generation process of preferred responses. The loss function is defined as: $$ \\mathcal{L}_{\\text{gen}}=-\\frac{\\log\\pi_\\theta\\left(y_c \\mid x\\right)}{\\left| y_c \\right|}. 
$$ ## Evaluation on Multimodal Capability To comprehensively compare InternVL's performance before and after MPO, we employ the benchmarks from OpenCompass Learderboard, including both well-established classic datasets and newly introduced ones. These benchmarks span a wide range of categories, aiming to provide a thorough and balanced assessment of InternVL’s capabilities across various multimodal tasks. We provide the evaluation results in the tables behind. | Model | Avg. | MMBench v1.1 | MMStar | MMMU | MathVista | HallusionBench | AI2D | OCRBench | MMVet | | ------------------- | ---- | ------------ | ------ | ---- | --------- | -------------- | ---- | -------- | ----- | | InternVL2-5-1B | 54.9 | 66.5 | 51.3 | 41.2 | 47.1 | 39.4 | 69.0 | 77.4 | 47.2 | | InternVL2-5-1B-MPO | 56.4 | 67.2 | 49.7 | 40.8 | 53.0 | 40.0 | 69.4 | 83.6 | 47.2 | | InternVL2-5-2B | 59.9 | 70.9 | 54.3 | 43.2 | 51.1 | 42.3 | 74.9 | 80.2 | 62.6 | | InternVL2-5-2B-MPO | 62.0 | 71.6 | 55.0 | 45.0 | 56.4 | 43.0 | 75.3 | 84.2 | 65.4 | | InternVL2-5-4B | 65.1 | 78.2 | 58.7 | 51.8 | 60.8 | 46.6 | 81.4 | 82.0 | 61.5 | | InternVL2-5-4B-MPO | 67.6 | 78.6 | 60.2 | 51.6 | 65.3 | 47.8 | 82.0 | 88.0 | 67.1 | | InternVL2-5-8B | 68.9 | 82.5 | 63.2 | 56.2 | 64.5 | 49.0 | 84.6 | 82.1 | 62.8 | | InternVL2-5-8B-MPO | 70.4 | 82.4 | 65.7 | 54.9 | 68.9 | 51.4 | 84.5 | 88.3 | 66.9 | | InternVL2-5-26B | 71.6 | 84.6 | 66.5 | 60.7 | 68.0 | 55.8 | 86.2 | 85.4 | 65.4 | | InternVL2-5-26B-MPO | 72.7 | 84.2 | 67.2 | 57.7 | 72.8 | 55.3 | 86.2 | 91.2 | 67.1 | | InternVL2-5-38B | 73.5 | 85.4 | 68.5 | 64.6 | 72.4 | 57.9 | 87.6 | 84.1 | 67.2 | | InternVL2-5-38B-MPO | 75.5 | 85.6 | 69.8 | 64.1 | 73.8 | 61.5 | 88.1 | 88.5 | 72.5 | | InternVL2-5-78B | 75.2 | 87.5 | 69.5 | 70.0 | 70.6 | 57.4 | 89.1 | 85.3 | 71.8 | | InternVL2-5-78B-MPO | 76.6 | 87.3 | 73.1 | 68.3 | 73.8 | 58.7 | 89.3 | 91.2 | 71.4 | ## Quick Start We provide an example code to run using . > Please use transformers>=4.37.2 to ensure the model works normally. ### Model Loading #### 16-bit (bf16 / fp16) #### BNB 8-bit Quantization #### Multiple GPUs The reason for writing the code this way is to avoid errors that occur during multi-GPU inference due to tensors not being on the same device. By ensuring that the first and last layers of the large language model (LLM) are on the same device, we prevent such errors. ### Inference with Transformers #### Streaming Output Besides this method, you can also use the following code to get streamed output. ## Finetune Many repositories now support fine-tuning of the InternVL series models, including InternVL, SWIFT, XTurner, and others. Please refer to their documentation for more details on fine-tuning. ## Deployment ### LMDeploy LMDeploy is a toolkit for compressing, deploying, and serving LLMs & VLMs. LMDeploy abstracts the complex inference process of multi-modal Vision-Language Models (VLM) into an easy-to-use pipeline, similar to the Large Language Model (LLM) inference pipeline. #### A 'Hello, world' Example If occurs while executing this case, please install the required dependency packages as prompted. #### Multi-images Inference When dealing with multiple images, you can put them all in one list. Keep in mind that multiple images will lead to a higher number of input tokens, and as a result, the size of the context window typically needs to be increased. 
#### Batch Prompts Inference Conducting inference with batch prompts is quite straightforward; just place them within a list structure: #### Multi-turn Conversation There are two ways to do the multi-turn conversations with the pipeline. One is to construct messages according to the format of OpenAI and use above introduced method, the other is to use the interface. #### Service LMDeploy's enables models to be easily packed into services with a single command. The provided RESTful APIs are compatible with OpenAI's interfaces. Below are an example of service startup: To use the OpenAI-style interface, you need to install OpenAI: Then, use the code below to make the API call: ## License This project is released under the MIT License. This project uses the pre-trained Qwen2.5-32B-Instruct as a component, which is licensed under the Apache License 2.0. ## Citation If you find this project useful in your research, please consider citing:",
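A minimal sketch of the Quick Start flow above (transformers >= 4.37.2). The image tiling helper from the original snippet is not reproduced here; `pixel_values` is a placeholder tensor, and `model.chat` comes from the repository's custom code loaded via `trust_remote_code=True`.

```python
# Minimal sketch: loading InternVL2_5-38B-MPO and running one single-image chat turn.
import torch
from transformers import AutoModel, AutoTokenizer

path = "OpenGVLab/InternVL2_5-38B-MPO"
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
    device_map="auto",
).eval()
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True, use_fast=False)

# Placeholder for preprocessed 448x448 image tiles; in practice these come from the
# dynamic-resolution tiling step described in the official snippet.
pixel_values = torch.randn(7, 3, 448, 448, dtype=torch.bfloat16).cuda()
generation_config = dict(max_new_tokens=512, do_sample=False)

question = "<image>\nDescribe this image in detail."
response = model.chat(tokenizer, pixel_values, question, generation_config)
print(response)
```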
+ "model_explanation_gemini": "**A multilingual multimodal large language model (MLLM) optimized for image-text-to-text tasks, built on InternVL2.5 with Mixed Preference Optimization for superior performance.** \n\n### **Features:** \n- **Multimodal Capability:** Processes both image and text inputs to generate text outputs. \n- **Multilingual Support:** Works across multiple languages. \n- **Advanced Architecture:** Uses a \"ViT-MLP-LLM\" paradigm with InternViT and LLMs like Qwen"
+}
\ No newline at end of file
diff --git a/model_data_json/OpenGVLab_InternVL3-2B.json b/model_data_json/OpenGVLab_InternVL3-2B.json
new file mode 100644
index 0000000000000000000000000000000000000000..4f09152f45d91ec97cdd119d76357b75d9478828
--- /dev/null
+++ b/model_data_json/OpenGVLab_InternVL3-2B.json
@@ -0,0 +1,28 @@
+{
+ "model_id": "OpenGVLab/InternVL3-2B",
+ "downloads": 80574,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "internvl_chat",
+ "feature-extraction",
+ "internvl",
+ "custom_code",
+ "image-text-to-text",
+ "conversational",
+ "multilingual",
+ "dataset:OpenGVLab/MMPR-v1.2",
+ "arxiv:2312.14238",
+ "arxiv:2404.16821",
+ "arxiv:2412.05271",
+ "arxiv:2411.10442",
+ "arxiv:2504.10479",
+ "arxiv:2412.09616",
+ "base_model:OpenGVLab/InternVL3-2B-Instruct",
+ "base_model:finetune:OpenGVLab/InternVL3-2B-Instruct",
+ "license:apache-2.0",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 license_name: qwen license_link: pipeline_tag: image-text-to-text library_name: transformers base_model: - OpenGVLab/InternVL3-2B-Instruct base_model_relation: finetune datasets: - OpenGVLab/MMPR-v1.2 language: - multilingual tags: - internvl - custom_code --- # InternVL3-2B [\\[📂 GitHub\\]]( [\\[📜 InternVL 1.0\\]]( [\\[📜 InternVL 1.5\\]]( [\\[📜 InternVL 2.5\\]]( [\\[📜 InternVL2.5-MPO\\]]( [\\[📜 InternVL3\\]]( [\\[🆕 Blog\\]]( [\\[🗨️ Chat Demo\\]]( [\\[🤗 HF Demo\\]]( [\\[🚀 Quick Start\\]](#quick-start) [\\[📖 Documents\\]](
**Supported languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. **Llama 3.1 family of models**. Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date:** July 23, 2024. **Status:** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. **License:** A custom commercial license, the Llama 3.1 Community License, is available at: Where to send questions or comments about the model Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go here. ## Intended Use **Intended Use Cases** Llama 3.1 is intended for commercial and research use in multiple languages. Instruction tuned text only models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. The Llama 3.1 model collection also supports the ability to leverage the outputs of its models to improve other models including synthetic data generation and distillation. The Llama 3.1 Community License allows for these use cases. **Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.1 Community License. Use in languages beyond those explicitly referenced as supported in this model card**. **Note: Llama 3.1 has been trained on a broader collection of languages than the 8 supported languages. Developers may fine-tune Llama 3.1 models for languages beyond the 8 supported languages provided they comply with the Llama 3.1 Community License and the Acceptable Use Policy and in such cases are responsible for ensuring that any uses of Llama 3.1 in additional languages is done in a safe and responsible manner. ## How to use This repository contains two versions of Meta-Llama-3.1-8B-Instruct, for use with transformers and with the original codebase. ### Use with transformers Starting with onward, you can run conversational inference using the Transformers abstraction or by leveraging the Auto classes with the function. Make sure to update your transformers installation via . Note: You can also find detailed recipes on how to use the model locally, with , assisted generations, quantised and more at []( ### Use with Please, follow the instructions in the repository To download Original checkpoints, see the example command below leveraging : ## Hardware and Software **Training Factors** We used custom training libraries, Meta's custom built GPU cluster, and production infrastructure for pretraining. Fine-tuning, annotation, and evaluation were also performed on production infrastructure. **Training utilized a cumulative of** 39.3M GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency. **Training Greenhouse Gas Emissions** Estimated total location-based greenhouse gas emissions were **11,390** tons CO2eq for training. 
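Referring back to the "Use with transformers" instructions above, a minimal sketch of conversational inference with the 8B Instruct checkpoint (assumes a recent transformers release with chat-style pipelines and that access to the gated repository has been granted):

```python
# Minimal sketch: conversational inference with Meta-Llama-3.1-8B-Instruct.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Who are you?"},
]
outputs = pipe(messages, max_new_tokens=128)
# The pipeline returns the full conversation; the last message is the new assistant reply.
print(outputs[0]["generated_text"][-1]["content"])
```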
Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with renewable energy, therefore the total market-based greenhouse gas emissions for training were 0 tons CO2eq.
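A minimal sketch of the conversational inference described in the "Use with transformers" section above; the repository name `meta-llama/Meta-Llama-3.1-8B-Instruct` and the generation settings are assumptions for illustration rather than values taken from this card.

```python
# Hedged sketch: chat-style inference with the transformers pipeline (assumed repo name).
# Requires a recent transformers release and access to the gated meta-llama repository.
import torch
import transformers

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # assumed repository name

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]

outputs = pipeline(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1])  # last turn is the assistant reply
```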
| | Training Time (GPU hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) |
|---|---|---|---|---|
| Llama 3.1 8B | 1.46M | 700 | 420 | 0 |
| Llama 3.1 70B | 7.0M | 700 | 2,040 | 0 |
| Llama 3.1 405B | 30.84M | 700 | 8,930 | 0 |
| Total | 39.3M | | 11,390 | 0 |
The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others. ## Training Data **Overview:** Llama 3.1 was pretrained on ~15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 25M synthetically generated examples. **Data Freshness:** The pretraining data has a cutoff of December 2023. ## Benchmark scores In this section, we report the results for Llama 3.1 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library. ### Base pretrained models
| Category | Benchmark | # Shots | Metric | Llama 3 8B | Llama 3.1 8B | Llama 3 70B | Llama 3.1 70B | Llama 3.1 405B |
|---|---|---|---|---|---|---|---|---|
| General | MMLU | 5 | macro_avg/acc_char | 66.7 | 66.7 | 79.5 | 79.3 | 85.2 |
| | MMLU-Pro (CoT) | 5 | macro_avg/acc_char | 36.2 | 37.1 | 55.0 | 53.8 | 61.6 |
| | AGIEval English | 3-5 | average/acc_char | 47.1 | 47.8 | 63.0 | 64.6 | 71.6 |
| | CommonSenseQA | 7 | acc_char | 72.6 | 75.0 | 83.8 | 84.1 | 85.8 |
| | Winogrande | 5 | acc_char | - | 60.5 | - | 83.3 | 86.7 |
| | BIG-Bench Hard (CoT) | 3 | average/em | 61.1 | 64.2 | 81.3 | 81.6 | 85.9 |
| | ARC-Challenge | 25 | acc_char | 79.4 | 79.7 | 93.1 | 92.9 | 96.1 |
| Knowledge reasoning | TriviaQA-Wiki | 5 | em | 78.5 | 77.6 | 89.7 | 89.8 | 91.8 |
| Reading comprehension | SQuAD | 1 | em | 76.4 | 77.0 | 85.6 | 81.8 | 89.3 |
| | QuAC (F1) | 1 | f1 | 44.4 | 44.9 | 51.1 | 51.1 | 53.6 |
| | BoolQ | 0 | acc_char | 75.7 | 75.0 | 79.0 | 79.4 | 80.0 |
| | DROP (F1) | 3 | f1 | 58.4 | 59.5 | 79.7 | 79.6 | 84.8 |
### Instruction tuned models
| Category | Benchmark | # Shots | Metric | Llama 3 8B Instruct | Llama 3.1 8B Instruct | Llama 3 70B Instruct | Llama 3.1 70B Instruct | Llama 3.1 405B Instruct |
|---|---|---|---|---|---|---|---|---|
| General | MMLU | 5 | macro_avg/acc | 68.5 | 69.4 | 82.0 | 83.6 | 87.3 |
| | MMLU (CoT) | 0 | macro_avg/acc | 65.3 | 73.0 | 80.9 | 86.0 | 88.6 |
| | MMLU-Pro (CoT) | 5 | micro_avg/acc_char | 45.5 | 48.3 | 63.4 | 66.4 | 73.3 |
| | IFEval | | | 76.8 | 80.4 | 82.9 | 87.5 | 88.6 |
| Reasoning | ARC-C | 0 | acc | 82.4 | 83.4 | 94.4 | 94.8 | 96.9 |
| | GPQA | 0 | em | 34.6 | 30.4 | 39.5 | 41.7 | 50.7 |
| Code | HumanEval | 0 | pass@1 | 60.4 | 72.6 | 81.7 | 80.5 | 89.0 |
| | MBPP ++ base version | 0 | pass@1 | 70.6 | 72.8 | 82.5 | 86.0 | 88.6 |
| | Multipl-E HumanEval | 0 | pass@1 | - | 50.8 | - | 65.5 | 75.2 |
| | Multipl-E MBPP | 0 | pass@1 | - | 52.4 | - | 62.0 | 65.7 |
| Math | GSM-8K (CoT) | 8 | em_maj1@1 | 80.6 | 84.5 | 93.0 | 95.1 | 96.8 |
| | MATH (CoT) | 0 | final_em | 29.1 | 51.9 | 51.0 | 68.0 | 73.8 |
| Tool Use | API-Bank | 0 | acc | 48.3 | 82.6 | 85.1 | 90.0 | 92.0 |
| | BFCL | 0 | acc | 60.3 | 76.1 | 83.0 | 84.8 | 88.5 |
| | Gorilla Benchmark API Bench | 0 | acc | 1.7 | 8.2 | 14.7 | 29.7 | 35.3 |
| | Nexus (0-shot) | 0 | macro_avg/acc | 18.1 | 38.5 | 47.8 | 56.7 | 58.7 |
| Multilingual | Multilingual MGSM (CoT) | 0 | em | - | 68.9 | - | 86.9 | 91.6 |
#### Multilingual benchmarks
| Category | Benchmark | Language | Llama 3.1 8B | Llama 3.1 70B | Llama 3.1 405B |
|---|---|---|---|---|---|
| General | MMLU (5-shot, macro_avg/acc) | Portuguese | 62.12 | 80.13 | 84.95 |
| | | Spanish | 62.45 | 80.05 | 85.08 |
| | | Italian | 61.63 | 80.4 | 85.04 |
| | | German | 60.59 | 79.27 | 84.36 |
| | | French | 62.34 | 79.82 | 84.66 |
| | | Hindi | 50.88 | 74.52 | 80.31 |
| | | Thai | 50.32 | 72.95 | 78.21 |
## Responsibility & Safety As part of our Responsible release approach, we followed a three-pronged strategy to managing trust & safety risks: * Enable developers to deploy helpful, safe and flexible experiences for their target audience and for the use cases supported by Llama. * Protect developers against adversarial users aiming to exploit Llama capabilities to potentially cause harm. * Provide protections for the community to help prevent the misuse of our models. ### Responsible deployment Llama is a foundational technology designed to be used in a variety of use cases, examples on how Meta’s Llama models have been responsibly deployed can be found in our Community Stories webpage. Our approach is to build the most helpful models enabling the world to benefit from the technology power, by aligning our model safety for the generic use cases addressing a standard set of harms. Developers are then in the driver seat to tailor safety for their use case, defining their own policy and deploying the models with the necessary safeguards in their Llama systems. Llama 3.1 was developed following the best practices outlined in our Responsible Use Guide, you can refer to the Responsible Use Guide to learn more. #### Llama 3.1 instruct Our main objectives for conducting safety fine-tuning are to provide the research community with a valuable resource for studying the robustness of safety fine-tuning, as well as to offer developers a readily available, safe, and powerful model for various applications to reduce the developer workload to deploy safe AI systems. For more details on the safety mitigations implemented please read the Llama 3 paper. **Fine-tuning data** We employ a multi-faceted approach to data collection, combining human-generated data from our vendors with synthetic data to mitigate potential safety risks. We’ve developed many large language model (LLM)-based classifiers that enable us to thoughtfully select high-quality prompts and responses, enhancing data quality control. **Refusals and Tone** Building on the work we started with Llama 3, we put a great emphasis on model refusals to benign prompts as well as refusal tone. We included both borderline and adversarial prompts in our safety data strategy, and modified our safety data responses to follow tone guidelines. #### Llama 3.1 systems **Large language models, including Llama 3.1, are not designed to be deployed in isolation but instead should be deployed as part of an overall AI system with additional safety guardrails as required.** Developers are expected to deploy system safeguards when building agentic systems. Safeguards are key to achieve the right helpfulness-safety alignment as well as mitigating safety and security risks inherent to the system and any integration of the model or system with external tools. As part of our responsible release approach, we provide the community with safeguards that developers should deploy with Llama models or other LLMs, including Llama Guard 3, Prompt Guard and Code Shield. All our reference implementations demos contain these safeguards by default so developers can benefit from system-level safety out-of-the-box. #### New capabilities Note that this release introduces new capabilities, including a longer context window, multilingual inputs and outputs and possible integrations by developers with third party tools. Building with these new capabilities requires specific considerations in addition to the best practices that generally apply across all Generative AI use cases. 
**Tool-use**: Just like in standard software development, developers are responsible for the integration of the LLM with the tools and services of their choice. They should define a clear policy for their use case and assess the integrity of the third party services they use to be aware of the safety and security limitations when using this capability. Refer to the Responsible Use Guide for best practices on the safe deployment of the third party safeguards. **Multilinguality**: Llama 3.1 supports 7 languages in addition to English: French, German, Hindi, Italian, Portuguese, Spanish, and Thai. Llama may be able to output text in other languages than those that meet performance thresholds for safety and helpfulness. We strongly discourage developers from using this model to converse in non-supported languages without implementing finetuning and system controls in alignment with their policies and the best practices shared in the Responsible Use Guide. ### Evaluations We evaluated Llama models for common use cases as well as specific capabilities. Common use cases evaluations measure safety risks of systems for most commonly built applications including chat bot, coding assistant, tool calls. We built dedicated, adversarial evaluation datasets and evaluated systems composed of Llama models and Llama Guard 3 to filter input prompt and output response. It is important to evaluate applications in context, and we recommend building dedicated evaluation dataset for your use case. Prompt Guard and Code Shield are also available if relevant to the application. Capability evaluations measure vulnerabilities of Llama models inherent to specific capabilities, for which were crafted dedicated benchmarks including long context, multilingual, tools calls, coding or memorization. **Red teaming** For both scenarios, we conducted recurring red teaming exercises with the goal of discovering risks via adversarial prompting and we used the learnings to improve our benchmarks and safety tuning datasets. We partnered early with subject-matter experts in critical risk areas to understand the nature of these real-world harms and how such models may lead to unintended harm for society. Based on these conversations, we derived a set of adversarial goals for the red team to attempt to achieve, such as extracting harmful information or reprogramming the model to act in a potentially harmful capacity. The red team consisted of experts in cybersecurity, adversarial machine learning, responsible AI, and integrity in addition to multilingual content specialists with background in integrity issues in specific geographic markets. ### Critical and other risks We specifically focused our efforts on mitigating the following critical risk areas: **1- CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosive materials) helpfulness** To assess risks related to proliferation of chemical and biological weapons, we performed uplift testing designed to assess whether use of Llama 3.1 models could meaningfully increase the capabilities of malicious actors to plan or carry out attacks using these types of weapons. **2. Child Safety** Child Safety risk assessments were conducted using a team of experts, to assess the model’s capability to produce outputs that could result in Child Safety risks and inform on any necessary and appropriate risk mitigations via fine tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks through Llama 3 model development. 
For Llama 3, we conducted new in-depth sessions using objective based methodologies to assess the model risks along multiple attack vectors including the additional languages Llama 3 is trained on. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content while taking account of market specific nuances or experiences. **3. Cyber attack enablement** Our cyber attack uplift study investigated whether LLMs can enhance human capabilities in hacking tasks, both in terms of skill level and speed. Our attack automation study focused on evaluating the capabilities of LLMs when used as autonomous agents in cyber offensive operations, specifically in the context of ransomware attacks. This evaluation was distinct from previous studies that considered LLMs as interactive assistants. The primary objective was to assess whether these models could effectively function as independent agents in executing complex cyber-attacks without human intervention. Our study of Llama-3.1-405B’s social engineering uplift for cyber attackers was conducted to assess the effectiveness of AI models in aiding cyber threat actors in spear phishing campaigns. Please read our Llama 3.1 Cyber security whitepaper to learn more. ### Community Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open sourced for the community to use and widely distributed across ecosystem partners including cloud service providers. We encourage community contributions to our Github repository. We also set up the Llama Impact Grants program to identify and support the most compelling applications of Meta’s Llama model for societal benefit across three categories: education, climate and open innovation. The 20 finalists from the hundreds of applications can be found here. Finally, we put in place a set of resources including an output reporting mechanism and bug bounty program to continuously improve the Llama technology with the help of the community. ## Ethical Considerations and Limitations The core values of Llama 3.1 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3.1 addresses users and their needs as they are, without insertion unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. But Llama 3.1 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3.1’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. 
Therefore, before deploying any applications of Llama 3.1 models, developers should perform safety testing and tuning tailored to their specific applications of the model. Please refer to available resources including our Responsible Use Guide, Trust and Safety solutions, and other resources to learn more about responsible development.",
+ "model_explanation_gemini": "A quantized version of Meta's Llama-3.1-8B-Instruct model optimized for efficient inference, supporting multilingual text generation.\n\n**Features:** \n- Multilingual support (English, German, French, Italian, Portuguese, Hindi, Spanish, Thai) \n- Text generation capability \n- Quantized GGUF format for efficiency \n- Derived from Meta's Llama 3.1 architecture \n- Community license with redistribution terms \n\n**Comparison:** \nThis model is a quantized variant"
+}
\ No newline at end of file
diff --git a/model_data_json/Qwen_QwQ-32B-GGUF.json b/model_data_json/Qwen_QwQ-32B-GGUF.json
new file mode 100644
index 0000000000000000000000000000000000000000..9f16815eddc38d56eb09fb417056b8c2c0a4874e
--- /dev/null
+++ b/model_data_json/Qwen_QwQ-32B-GGUF.json
@@ -0,0 +1,20 @@
+{
+ "model_id": "Qwen/QwQ-32B-GGUF",
+ "downloads": 68965,
+ "tags": [
+ "gguf",
+ "chat",
+ "text-generation",
+ "en",
+ "arxiv:2309.00071",
+ "arxiv:2412.15115",
+ "base_model:Qwen/QwQ-32B",
+ "base_model:quantized:Qwen/QwQ-32B",
+ "license:apache-2.0",
+ "endpoints_compatible",
+ "region:us",
+ "conversational"
+ ],
+ "description": "--- license: apache-2.0 license_link: language: - en pipeline_tag: text-generation base_model: Qwen/QwQ-32B tags: - chat --- # QwQ-32B-GGUF ## Introduction QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, can achieve significantly enhanced performance in downstream tasks, especially hard problems. QwQ-32B is the medium-sized reasoning model, which is capable of achieving competitive performance against state-of-the-art reasoning models, e.g., DeepSeek-R1, o1-mini.
**This repo contains the QwQ 32B model in the GGUF Format**, which has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training (Supervised Finetuning and Reinforcement Learning) - Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias - Number of Parameters: 32.5B - Number of Parameters (Non-Embedding): 31.0B - Number of Layers: 64 - Number of Attention Heads (GQA): 40 for Q and 8 for KV - Context Length: Full 131,072 tokens - Quantization: q4_K_M, q5_0, q5_K_M, q6_K, q8_0 **Note:** For the best experience, please review the usage guidelines before deploying QwQ models. You can try our demo or access QwQ models via QwenChat. For more details, please refer to our blog, GitHub, and Documentation. ## Requirements QwQ is based on Qwen2.5, whose code has been in the latest Hugging Face . We advise you to use the latest version of . With , you will encounter the following error: Also check out our AWQ documentation for more usage guidance. ## Quickstart Check out our llama.cpp documentation for more usage guidance. We advise you to clone []( and install it following the official guide. We follow the latest version of llama.cpp. In the following demonstration, we assume that you are running commands under the repository . You can use the following commands for inference: ### Usage Guidelines To achieve optimal performance, we recommend the following settings: 1. **Enforce Thoughtful Output**: Ensure the model starts with \"\\\\n\" to prevent generating empty thinking content, which can degrade output quality. 2. **Sampling Parameters**: - Use Temperature=0.6, TopP=0.95, MinP=0 instead of Greedy decoding to avoid endless repetitions. - Use TopK between 20 and 40 to filter out rare token occurrences while maintaining the diversity of the generated output. - For supported frameworks, you can adjust the parameter between 0 and 2 to reduce endless repetitions. However, using a higher value may result in occasional language mixing and a slight decrease in performance. 3. **No Thinking Content in History**: In multi-turn conversations, the historical model output should only include the final output part and does not need to include the thinking content. This feature is already implemented in . 4. **Standardize Output Format**: We recommend using prompts to standardize model outputs when benchmarking. - **Math Problems**: Include \"Please reason step by step, and put your final answer within \\boxed{}.\" in the prompt. - **Multiple-Choice Questions**: Add the following JSON structure to the prompt to standardize responses: \"Please show your choice in the field with only the choice letter, e.g.,.\" in the prompt. 5. **Handle Long Inputs**: For inputs exceeding 32,768 tokens, enable YaRN to improve the model's ability to capture long-sequence information effectively. Currently, only vLLM supports YaRN for length extrapolation. If you want to process sequences up to 131,072 tokens, please refer to non-GGUF models. 6. **Other References**: You can also consult Unsloth's Guide to see if their approach meets your needs. (Thanks to the Unsloth team!) ## Evaluation & Performance Detailed evaluation results are reported in this 📑 blog. For requirements on GPU memory and the respective throughput, see results here. ## Citation If you find our work helpful, feel free to give us a cite.",
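A minimal sketch of the llama.cpp-based Quickstart above, expressed through the `llama-cpp-python` bindings rather than the CLI; the local filename is hypothetical, and the sampling values mirror the usage guidelines (Temperature=0.6, TopP=0.95, MinP=0, TopK within 20-40).

```python
# Hedged sketch: running a local QwQ-32B GGUF file via llama-cpp-python.
# The filename below is hypothetical; use one of the released quantizations (e.g. q4_K_M).
from llama_cpp import Llama

llm = Llama(
    model_path="qwq-32b-q4_k_m.gguf",  # hypothetical local path
    n_ctx=32768,                       # see guideline 5 for inputs beyond 32,768 tokens
    n_gpu_layers=-1,                   # offload all layers when a GPU is available
)

messages = [
    {
        "role": "user",
        "content": "Please reason step by step, and put your final answer within "
                   "\\boxed{}. What is 17 * 24?",
    },
]

out = llm.create_chat_completion(
    messages=messages,
    temperature=0.6,   # recommended instead of greedy decoding
    top_p=0.95,
    min_p=0.0,
    top_k=30,          # within the recommended 20-40 range
    max_tokens=2048,
)
print(out["choices"][0]["message"]["content"])
```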
+ "model_explanation_gemini": "A 32.5B-parameter reasoning-focused language model optimized for enhanced performance on complex tasks through structured thinking and step-by-step problem-solving, supporting long-context (131K tokens) and GGUF quantization. \n\n**Features:** \n- **Type:** Causal language model \n- **Architecture:** Transformers with RoPE, SwiGLU, RMSNorm, Attention QKV bias \n- **Training:** Pretraining + supervised finetuning + RL \n- **Context Length:**"
+}
\ No newline at end of file
diff --git a/model_data_json/Qwen_Qwen2-VL-2B.json b/model_data_json/Qwen_Qwen2-VL-2B.json
new file mode 100644
index 0000000000000000000000000000000000000000..46e8956afb91e668e80cf2cebe5721d22f022583
--- /dev/null
+++ b/model_data_json/Qwen_Qwen2-VL-2B.json
@@ -0,0 +1,21 @@
+{
+ "model_id": "Qwen/Qwen2-VL-2B",
+ "downloads": 77369,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "qwen2_vl",
+ "image-text-to-text",
+ "multimodal",
+ "conversational",
+ "en",
+ "arxiv:2409.12191",
+ "arxiv:2308.12966",
+ "license:apache-2.0",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 language: - en pipeline_tag: image-text-to-text tags: - multimodal library_name: transformers --- # Qwen2-VL-2B ## Introduction We're excited to unveil **Qwen2-VL**, the latest iteration of our Qwen-VL model, representing nearly a year of innovation. > [!Important] > This is the base pretrained model of Qwen2-VL-2B without instruction tuning. ### What’s New in Qwen2-VL? #### Key Enhancements: * **SoTA understanding of images of various resolution & ratio**: Qwen2-VL achieves state-of-the-art performance on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc. * **Understanding videos of 20min+**: Qwen2-VL can understand videos over 20 minutes for high-quality video-based question answering, dialog, content creation, etc. * **Agent that can operate your mobiles, robots, etc.**: with the abilities of complex reasoning and decision making, Qwen2-VL can be integrated with devices like mobile phones, robots, etc., for automatic operation based on visual environment and text instructions. * **Multilingual Support**: to serve global users, besides English and Chinese, Qwen2-VL now supports the understanding of texts in different languages inside images, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc. #### Model Architecture Updates: * **Naive Dynamic Resolution**: Unlike before, Qwen2-VL can handle arbitrary image resolutions, mapping them into a dynamic number of visual tokens, offering a more human-like visual processing experience.
* **Multimodal Rotary Position Embedding (M-ROPE)**: Decomposes positional embedding into parts to capture 1D textual, 2D visual, and 3D video positional information, enhancing its multimodal processing capabilities.
We have three models with 2, 7 and 72 billion parameters. This repo contains the **pretrained** 2B Qwen2-VL model. For more information, visit our Blog and GitHub. ## Requirements The code of Qwen2-VL has been in the latest Hugging Face and we advise you to install the latest version with command , or you might encounter the following error: ## Citation If you find our work helpful, feel free to give us a cite.",
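A minimal sketch, assuming the `Qwen2VLForConditionalGeneration` class available in recent transformers releases, of loading this base (non-instruct) checkpoint; it is meant as a starting point for fine-tuning rather than for chat.

```python
# Hedged sketch: loading the base Qwen2-VL-2B checkpoint with transformers.
# The base model is not instruction tuned, so chat-style prompting is not recommended.
import torch
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-2B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-2B")

# From here one would typically attach an image+text fine-tuning loop;
# the processor handles the dynamic-resolution image tokenization described above.
print(model.config.model_type)  # "qwen2_vl"
```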
+ "model_explanation_gemini": "\"Qwen2-VL-2B is a 2-billion-parameter multimodal pretrained model excelling in visual understanding (images, videos), multilingual text recognition, and device integration for tasks like question answering, content creation, and robotic control.\"\n\n**Features**: \n- State-of-the-art image understanding (variable resolution/ratio) \n- Video comprehension (20+ minutes) \n- Multilingual text recognition in images (European, Asian, Arabic languages) \n- Device integration for robotic/m"
+}
\ No newline at end of file
diff --git a/model_data_json/Qwen_Qwen2-VL-72B.json b/model_data_json/Qwen_Qwen2-VL-72B.json
new file mode 100644
index 0000000000000000000000000000000000000000..4426684004063200d9d2fe13534dd14b492ac6c9
--- /dev/null
+++ b/model_data_json/Qwen_Qwen2-VL-72B.json
@@ -0,0 +1,21 @@
+{
+ "model_id": "Qwen/Qwen2-VL-72B",
+ "downloads": 77971,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "qwen2_vl",
+ "image-text-to-text",
+ "multimodal",
+ "conversational",
+ "en",
+ "arxiv:2409.12191",
+ "arxiv:2308.12966",
+ "license:other",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: other license_name: qwen license_link: language: - en pipeline_tag: image-text-to-text tags: - multimodal library_name: transformers --- # Qwen2-VL-72B ## Introduction We're excited to unveil **Qwen2-VL**, the latest iteration of our Qwen-VL model, representing nearly a year of innovation. > [!Important] > This is the base pretrained model of Qwen2-VL-72B without instruction tuning. ### What’s New in Qwen2-VL? #### Key Enhancements: * **SoTA understanding of images of various resolution & ratio**: Qwen2-VL achieves state-of-the-art performance on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc. * **Understanding videos of 20min+**: Qwen2-VL can understand videos over 20 minutes for high-quality video-based question answering, dialog, content creation, etc. * **Agent that can operate your mobiles, robots, etc.**: with the abilities of complex reasoning and decision making, Qwen2-VL can be integrated with devices like mobile phones, robots, etc., for automatic operation based on visual environment and text instructions. * **Multilingual Support**: to serve global users, besides English and Chinese, Qwen2-VL now supports the understanding of texts in different languages inside images, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc. #### Model Architecture Updates: * **Naive Dynamic Resolution**: Unlike before, Qwen2-VL can handle arbitrary image resolutions, mapping them into a dynamic number of visual tokens, offering a more human-like visual processing experience.
* **Multimodal Rotary Position Embedding (M-ROPE)**: Decomposes positional embedding into parts to capture 1D textual, 2D visual, and 3D video positional information, enhancing its multimodal processing capabilities.
We have three models with 2, 7 and 72 billion parameters. This repo contains the **pretrained** 72B Qwen2-VL model. For more information, visit our Blog and GitHub. ## Requirements The code of Qwen2-VL has been in the latest Hugging Face and we advise you to install the latest version with command , or you might encounter the following error: ## Citation If you find our work helpful, feel free to give us a cite.",
+ "model_explanation_gemini": "\"Qwen2-VL-72B is a multimodal, pretrained base model excelling in high-resolution image understanding, long-video comprehension (20+ minutes), multilingual text recognition in images, and device operation via visual reasoning, featuring dynamic resolution handling and enhanced multimodal processing.\"\n\n**Features:** \n1. State-of-the-art visual understanding (images, videos, documents). \n2. Supports arbitrary image resolutions via dynamic token mapping. \n3. Processes 20+ minute videos for QA/dialog"
+}
\ No newline at end of file
diff --git a/model_data_json/Qwen_Qwen2.5-32B-Instruct-AWQ.json b/model_data_json/Qwen_Qwen2.5-32B-Instruct-AWQ.json
new file mode 100644
index 0000000000000000000000000000000000000000..d0599589066ca826505b03685c4f643ce9ddb9ce
--- /dev/null
+++ b/model_data_json/Qwen_Qwen2.5-32B-Instruct-AWQ.json
@@ -0,0 +1,26 @@
+{
+ "model_id": "Qwen/Qwen2.5-32B-Instruct-AWQ",
+ "downloads": 70467,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "qwen2",
+ "text-generation",
+ "chat",
+ "conversational",
+ "en",
+ "arxiv:2309.00071",
+ "arxiv:2407.10671",
+ "base_model:Qwen/Qwen2.5-32B-Instruct",
+ "base_model:quantized:Qwen/Qwen2.5-32B-Instruct",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "4-bit",
+ "awq",
+ "region:us"
+ ],
+ "description": "--- base_model: Qwen/Qwen2.5-32B-Instruct language: - en library_name: transformers license: apache-2.0 license_link: pipeline_tag: text-generation tags: - chat --- # Qwen2.5-32B-Instruct-AWQ ## Introduction Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2: - Significantly **more knowledge** and has greatly improved capabilities in **coding** and **mathematics**, thanks to our specialized expert models in these domains. - Significant improvements in **instruction following**, **generating long texts** (over 8K tokens), **understanding structured data** (e.g, tables), and **generating structured outputs** especially JSON. **More resilient to the diversity of system prompts**, enhancing role-play implementation and condition-setting for chatbots. - **Long-context Support** up to 128K tokens and can generate up to 8K tokens. - **Multilingual support** for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more. **This repo contains the AWQ-quantized 4-bit instruction-tuned 32B Qwen2.5 model**, which has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias - Number of Parameters: 32.5B - Number of Paramaters (Non-Embedding): 31.0B - Number of Layers: 64 - Number of Attention Heads (GQA): 40 for Q and 8 for KV - Context Length: Full 131,072 tokens and generation 8192 tokens - Please refer to this section for detailed instructions on how to deploy Qwen2.5 for handling long texts. - Quantization: AWQ 4-bit For more details, please refer to our blog, GitHub, and Documentation. ## Requirements The code of Qwen2.5 has been in the latest Hugging face and we advise you to use the latest version of . With , you will encounter the following error: Also check out our AWQ documentation for more usage guide. ## Quickstart Here provides a code snippet with to show you how to load the tokenizer and model and how to generate contents. ### Processing Long Texts The current is set for context length up to 32,768 tokens. To handle extensive inputs exceeding 32,768 tokens, we utilize YaRN, a technique for enhancing model length extrapolation, ensuring optimal performance on lengthy texts. For supported frameworks, you could add the following to to enable YaRN: For deployment, we recommend using vLLM. Please refer to our Documentation for usage if you are not familar with vLLM. Presently, vLLM only supports static YARN, which means the scaling factor remains constant regardless of input length, **potentially impacting performance on shorter texts**. We advise adding the configuration only when processing long contexts is required. ## Evaluation & Performance Detailed evaluation results are reported in this 📑 blog. For quantized models, the benchmark results against the original bfloat16 models can be found here For requirements on GPU memory and the respective throughput, see results here. ## Citation If you find our work helpful, feel free to give us a cite.",
+ "model_explanation_gemini": "A 4-bit AWQ-quantized, 32.5B-parameter instruction-tuned causal language model optimized for text generation, featuring enhanced coding, mathematics, multilingual support (29+ languages), structured data handling, and long-context processing (128K tokens). \n\n**Features:** \n- **Core Task:** Text generation (chat, instruction following) \n- **Key Improvements:** Better coding/math, structured output (JSON), long-text generation (8K tokens), multilingual support"
+}
\ No newline at end of file
diff --git a/model_data_json/Qwen_Qwen2.5-32B.json b/model_data_json/Qwen_Qwen2.5-32B.json
new file mode 100644
index 0000000000000000000000000000000000000000..84f0774931278a3b386b9272aeb4e3345f8fa56a
--- /dev/null
+++ b/model_data_json/Qwen_Qwen2.5-32B.json
@@ -0,0 +1,16 @@
+{
+ "model_id": "Qwen/Qwen2.5-32B",
+ "downloads": 76454,
+ "tags": [
+ "safetensors",
+ "qwen2",
+ "text-generation",
+ "conversational",
+ "en",
+ "arxiv:2407.10671",
+ "license:apache-2.0",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 license_link: language: - en pipeline_tag: text-generation --- # Qwen2.5-32B ## Introduction Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2: - Significantly **more knowledge** and has greatly improved capabilities in **coding** and **mathematics**, thanks to our specialized expert models in these domains. - Significant improvements in **instruction following**, **generating long texts** (over 8K tokens), **understanding structured data** (e.g, tables), and **generating structured outputs** especially JSON. **More resilient to the diversity of system prompts**, enhancing role-play implementation and condition-setting for chatbots. - **Long-context Support** up to 128K tokens and can generate up to 8K tokens. - **Multilingual support** for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more. **This repo contains the base 32B Qwen2.5 model**, which has the following features: - Type: Causal Language Models - Training Stage: Pretraining - Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias - Number of Parameters: 32.5B - Number of Paramaters (Non-Embedding): 31.0B - Number of Layers: 64 - Number of Attention Heads (GQA): 40 for Q and 8 for KV - Context Length: 131,072 tokens **We do not recommend using base language models for conversations.** Instead, you can apply post-training, e.g., SFT, RLHF, continued pretraining, etc., on this model. For more details, please refer to our blog, GitHub, and Documentation. ## Requirements The code of Qwen2.5 has been in the latest Hugging face and we advise you to use the latest version of . With , you will encounter the following error: ## Evaluation & Performance Detailed evaluation results are reported in this 📑 blog. For requirements on GPU memory and the respective throughput, see results here. ## Citation If you find our work helpful, feel free to give us a cite.",
+ "model_explanation_gemini": "\"Qwen2.5-32B is a 32.5B-parameter causal language model optimized for text generation, featuring enhanced coding, mathematics, multilingual support (29+ languages), long-context handling (128K tokens), structured data understanding, and improved instruction following compared to Qwen2.\"\n\n### **Model Features:** \n- **Type:** Causal Language Model \n- **Architecture:** Transformers with RoPE, SwiGLU, RMSNorm, GQA (40"
+}
\ No newline at end of file
diff --git a/model_data_json/Qwen_Qwen2.5-Math-7B-Instruct.json b/model_data_json/Qwen_Qwen2.5-Math-7B-Instruct.json
new file mode 100644
index 0000000000000000000000000000000000000000..1d56f51391de66bd1b4e3ef42b56ce9387cffa60
--- /dev/null
+++ b/model_data_json/Qwen_Qwen2.5-Math-7B-Instruct.json
@@ -0,0 +1,23 @@
+{
+ "model_id": "Qwen/Qwen2.5-Math-7B-Instruct",
+ "downloads": 71240,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "qwen2",
+ "text-generation",
+ "chat",
+ "conversational",
+ "en",
+ "arxiv:2409.12122",
+ "base_model:Qwen/Qwen2.5-Math-7B",
+ "base_model:finetune:Qwen/Qwen2.5-Math-7B",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- base_model: Qwen/Qwen2.5-Math-7B language: - en pipeline_tag: text-generation tags: - chat library_name: transformers license: apache-2.0 license_link: --- # Qwen2.5-Math-7B-Instruct > [!Warning] >
> > 🚨 Qwen2.5-Math mainly supports solving English and Chinese math problems through CoT and TIR. We do not recommend using this series of models for other tasks. > >
## Introduction In August 2024, we released the first series of mathematical LLMs - Qwen2-Math - of our Qwen family. A month later, we upgraded it and open-sourced the **Qwen2.5-Math** series, including the base models **Qwen2.5-Math-1.5B/7B/72B**, the instruction-tuned models **Qwen2.5-Math-1.5B/7B/72B-Instruct**, and the mathematical reward model **Qwen2.5-Math-RM-72B**. Unlike the Qwen2-Math series, which only supports using Chain-of-Thought (CoT) to solve English math problems, the Qwen2.5-Math series is expanded to support using both CoT and Tool-integrated Reasoning (TIR) to solve math problems in both Chinese and English. The Qwen2.5-Math series models have achieved significant performance improvements compared to the Qwen2-Math series models on the Chinese and English mathematics benchmarks with CoT and Tool-integrated Reasoning (TIR). \n\n**Features:** \n1. Supports English and Chinese math problem-solving. \n2. Utilizes CoT and TIR for enhanced reasoning and computational precision. \n3. Outperforms Qwen2-Math on Chinese/English benchmarks. \n4. Available in 1.5B,"
+}
\ No newline at end of file
diff --git a/model_data_json/Qwen_Qwen2.5-VL-7B-Instruct-AWQ.json b/model_data_json/Qwen_Qwen2.5-VL-7B-Instruct-AWQ.json
new file mode 100644
index 0000000000000000000000000000000000000000..15e96dff41950772d9a48ac8fc9a2a76f304ce72
--- /dev/null
+++ b/model_data_json/Qwen_Qwen2.5-VL-7B-Instruct-AWQ.json
@@ -0,0 +1,26 @@
+{
+ "model_id": "Qwen/Qwen2.5-VL-7B-Instruct-AWQ",
+ "downloads": 78495,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "qwen2_5_vl",
+ "image-text-to-text",
+ "multimodal",
+ "conversational",
+ "en",
+ "arxiv:2309.00071",
+ "arxiv:2409.12191",
+ "arxiv:2308.12966",
+ "base_model:Qwen/Qwen2.5-VL-7B-Instruct",
+ "base_model:quantized:Qwen/Qwen2.5-VL-7B-Instruct",
+ "license:apache-2.0",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "4-bit",
+ "awq",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 language: - en pipeline_tag: image-text-to-text tags: - multimodal library_name: transformers base_model: - Qwen/Qwen2.5-VL-7B-Instruct --- # Qwen2.5-VL-7B-Instruct-AWQ ## Introduction In the past five months since Qwen2-VL’s release, numerous developers have built new models on the Qwen2-VL vision-language models, providing us with valuable feedback. During this period, we focused on building more useful vision-language models. Today, we are excited to introduce the latest addition to the Qwen family: Qwen2.5-VL. #### Key Enhancements: * **Understand things visually**: Qwen2.5-VL is not only proficient in recognizing common objects such as flowers, birds, fish, and insects, but it is highly capable of analyzing texts, charts, icons, graphics, and layouts within images. * **Being agentic**: Qwen2.5-VL directly plays as a visual agent that can reason and dynamically direct tools, which is capable of computer use and phone use. * **Understanding long videos and capturing events**: Qwen2.5-VL can comprehend videos of over 1 hour, and this time it has a new ability of cpaturing event by pinpointing the relevant video segments. * **Capable of visual localization in different formats**: Qwen2.5-VL can accurately localize objects in an image by generating bounding boxes or points, and it can provide stable JSON outputs for coordinates and attributes. * **Generating structured outputs**: for data like scans of invoices, forms, tables, etc. Qwen2.5-VL supports structured outputs of their contents, benefiting usages in finance, commerce, etc. #### Model Architecture Updates: * **Dynamic Resolution and Frame Rate Training for Video Understanding**: We extend dynamic resolution to the temporal dimension by adopting dynamic FPS sampling, enabling the model to comprehend videos at various sampling rates. Accordingly, we update mRoPE in the time dimension with IDs and absolute time alignment, enabling the model to learn temporal sequence and speed, and ultimately acquire the ability to pinpoint specific moments.
TheBloke's LLM work is generously supported by a grant from
# Llama 2 7B Chat - GGUF - Model creator: Meta Llama 2 - Original model: Llama 2 7B Chat ## Description This repo contains GGUF format model files for Meta Llama 2's Llama 2 7B Chat. ### About GGUF GGUF is a new format introduced by the llama.cpp team on August 21st 2023. It is a replacement for GGML, which is no longer supported by llama.cpp. GGUF offers numerous advantages over GGML, such as better tokenisation, and support for special tokens. It is also supports metadata, and is designed to be extensible. Here is an incomplate list of clients and libraries that are known to support GGUF: * llama.cpp. The source project for GGUF. Offers a CLI and a server option. * text-generation-webui, the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration. * KoboldCpp, a fully featured web UI, with GPU accel across all platforms and GPU architectures. Especially good for story telling. * LM Studio, an easy-to-use and powerful local GUI for Windows and macOS (Silicon), with GPU acceleration. * LoLLMS Web UI, a great web UI with many interesting and unique features, including a full model library for easy model selection. * Faraday.dev, an attractive and easy to use character-based chat GUI for Windows and macOS (both Silicon and Intel), with GPU acceleration. * ctransformers, a Python library with GPU accel, LangChain support, and OpenAI-compatible AI server. * llama-cpp-python, a Python library with GPU accel, LangChain support, and OpenAI-compatible API server. * candle, a Rust ML framework with a focus on performance, including GPU support, and ease of use. ## Repositories available * AWQ model(s) for GPU inference. * GPTQ models for GPU inference, with multiple quantisation parameter options. * 2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference * Meta Llama 2's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions ## Prompt template: Llama-2-Chat ## Compatibility These quantised GGUFv2 files are compatible with llama.cpp from August 27th onwards, as of commit d0cee0d36d5be95a0d9088b674dbb27354107221 They are also compatible with many third party UIs and libraries - please see the list at the top of this README. ## Explanation of quantisation methods Click to see details The new methods available are: * GGML_TYPE_Q2_K - \"type-1\" 2-bit quantization in super-blocks containing 16 blocks, each block having 16 weight. Block scales and mins are quantized with 4 bits. This ends up effectively using 2.5625 bits per weight (bpw) * GGML_TYPE_Q3_K - \"type-0\" 3-bit quantization in super-blocks containing 16 blocks, each block having 16 weights. Scales are quantized with 6 bits. This end up using 3.4375 bpw. * GGML_TYPE_Q4_K - \"type-1\" 4-bit quantization in super-blocks containing 8 blocks, each block having 32 weights. Scales and mins are quantized with 6 bits. This ends up using 4.5 bpw. * GGML_TYPE_Q5_K - \"type-1\" 5-bit quantization. Same super-block structure as GGML_TYPE_Q4_K resulting in 5.5 bpw * GGML_TYPE_Q6_K - \"type-0\" 6-bit quantization. Super-blocks with 16 blocks, each block having 16 weights. Scales are quantized with 8 bits. This ends up using 6.5625 bpw Refer to the Provided Files table below to see what files use which methods, and how. 
## Provided files | Name | Quant method | Bits | Size | Max RAM required | Use case | | ---- | ---- | ---- | ---- | ---- | ----- | | llama-2-7b-chat.Q2_K.gguf | Q2_K | 2 | 2.83 GB| 5.33 GB | smallest, significant quality loss - not recommended for most purposes | | llama-2-7b-chat.Q3_K_S.gguf | Q3_K_S | 3 | 2.95 GB| 5.45 GB | very small, high quality loss | | llama-2-7b-chat.Q3_K_M.gguf | Q3_K_M | 3 | 3.30 GB| 5.80 GB | very small, high quality loss | | llama-2-7b-chat.Q3_K_L.gguf | Q3_K_L | 3 | 3.60 GB| 6.10 GB | small, substantial quality loss | | llama-2-7b-chat.Q4_0.gguf | Q4_0 | 4 | 3.83 GB| 6.33 GB | legacy; small, very high quality loss - prefer using Q3_K_M | | llama-2-7b-chat.Q4_K_S.gguf | Q4_K_S | 4 | 3.86 GB| 6.36 GB | small, greater quality loss | | llama-2-7b-chat.Q4_K_M.gguf | Q4_K_M | 4 | 4.08 GB| 6.58 GB | medium, balanced quality - recommended | | llama-2-7b-chat.Q5_0.gguf | Q5_0 | 5 | 4.65 GB| 7.15 GB | legacy; medium, balanced quality - prefer using Q4_K_M | | llama-2-7b-chat.Q5_K_S.gguf | Q5_K_S | 5 | 4.65 GB| 7.15 GB | large, low quality loss - recommended | | llama-2-7b-chat.Q5_K_M.gguf | Q5_K_M | 5 | 4.78 GB| 7.28 GB | large, very low quality loss - recommended | | llama-2-7b-chat.Q6_K.gguf | Q6_K | 6 | 5.53 GB| 8.03 GB | very large, extremely low quality loss | | llama-2-7b-chat.Q8_0.gguf | Q8_0 | 8 | 7.16 GB| 9.66 GB | very large, extremely low quality loss - not recommended | **Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead. ## How to download GGUF files **Note for manual downloaders:** You almost never want to clone the entire repo! Multiple different quantisation formats are provided, and most users only want to pick and download a single file. The following clients/libraries will automatically download models for you, providing a list of available models to choose from: - LM Studio - LoLLMS Web UI - Faraday.dev ### In Under Download Model, you can enter the model repo: TheBloke/Llama-2-7b-Chat-GGUF and below it, a specific filename to download, such as: llama-2-7b-chat.q4_K_M.gguf. Then click Download. ### On the command line, including multiple files at once I recommend using the Python library: Then you can download any individual model file to the current directory, at high speed, with a command like this: More advanced huggingface-cli download usage You can also download multiple files at once with a pattern: For more documentation on downloading with , please see: HF -> Hub Python Library -> Download files -> Download from the CLI. To accelerate downloads on fast connections (1Gbit/s or higher), install : And set environment variable to : Windows CLI users: Use before running the download command. ## Example command Make sure you are using from commit d0cee0d36d5be95a0d9088b674dbb27354107221 or later. Change to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration. Change to the desired sequence length. For extended sequence models - eg 8K, 16K, 32K - the necessary RoPE scaling parameters are read from the GGUF file and set by llama.cpp automatically. If you want to have a chat-style conversation, replace the argument with For other parameters and how to use them, please refer to the llama.cpp documentation ## How to run in Further instructions here: text-generation-webui/docs/llama.cpp.md. ## How to run from Python code You can use GGUF models from Python using the llama-cpp-python or ctransformers libraries. 
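A minimal sketch of the Python route mentioned above, using `llama-cpp-python` with one of the files from the table and the Llama-2-Chat prompt template; the generation settings are illustrative assumptions.

```python
# Hedged sketch: loading a downloaded GGUF file with llama-cpp-python and applying
# the Llama-2-Chat prompt template ([INST] <<SYS>> ... <</SYS>> ... [/INST]).
from llama_cpp import Llama

llm = Llama(
    model_path="llama-2-7b-chat.Q4_K_M.gguf",  # downloaded beforehand, see instructions above
    n_ctx=4096,
    n_gpu_layers=35,  # set to 0 if you have no GPU acceleration
)

prompt = (
    "[INST] <<SYS>>\nYou are a helpful, respectful and honest assistant.\n<</SYS>>\n\n"
    "Explain in one paragraph what GGUF quantization trades off. [/INST]"
)
out = llm(prompt, max_tokens=256, temperature=0.7, stop=["</s>"])
print(out["choices"][0]["text"])
```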
### How to load this model from Python using ctransformers #### First install the package #### Simple example code to load one of these GGUF models ## How to use with LangChain Here's guides on using llama-cpp-python or ctransformers with LangChain: * LangChain + llama-cpp-python * LangChain + ctransformers ## Discord For further support, and discussions on these models and AI in general, join us at: TheBloke AI's Discord server ## Thanks, and how to contribute Thanks to the chirper.ai team! Thanks to Clay from gpus.llm-utils.org! I've had a lot of people ask if they can contribute. I enjoy providing models and helping people, and would love to be able to spend even more time doing it, as well as expanding into new projects like fine tuning/training. If you're able and willing to contribute it will be most gratefully received and will help me to keep providing more models, and to start work on new AI projects. Donaters will get priority support on any and all AI/LLM/model questions and requests, access to a private Discord room, plus other benefits. * Patreon: * Ko-Fi: **Special thanks to**: Aemon Algiz. **Patreon special mentions**: Alicia Loh, Stephen Murray, K, Ajan Kanaga, RoA, Magnesian, Deo Leter, Olakabola, Eugene Pentland, zynix, Deep Realms, Raymond Fosdick, Elijah Stavena, Iucharbius, Erik Bjäreholt, Luis Javier Navarrete Lozano, Nicholas, theTransient, John Detwiler, alfie_i, knownsqashed, Mano Prime, Willem Michiel, Enrico Ros, LangChain4j, OG, Michael Dempsey, Pierre Kircher, Pedro Madruga, James Bentley, Thomas Belote, Luke @flexchar, Leonard Tan, Johann-Peter Hartmann, Illia Dulskyi, Fen Risland, Chadd, S_X, Jeff Scroggin, Ken Nordquist, Sean Connelly, Artur Olbinski, Swaroop Kallakuri, Jack West, Ai Maven, David Ziegler, Russ Johnson, transmissions 11, John Villwock, Alps Aficionado, Clay Pascal, Viktor Bowallius, Subspace Studios, Rainer Wilmers, Trenton Dambrowitz, vamX, Michael Levine, 준교 김, Brandon Frisco, Kalila, Trailburnt, Randy H, Talal Aujan, Nathan Dryer, Vadim, 阿明, ReadyPlayerEmma, Tiffany J. Kim, George Stoitzev, Spencer Kim, Jerry Meng, Gabriel Tamborski, Cory Kujawski, Jeffrey Morgan, Spiking Neurons AB, Edmond Seymore, Alexandros Triantafyllidis, Lone Striker, Cap'n Zoog, Nikolai Manek, danny, ya boyyy, Derek Yates, usrbinkat, Mandus, TL, Nathan LeClaire, subjectnull, Imad Khwaja, webtim, Raven Klaugh, Asp the Wyvern, Gabriel Puliatti, Caitlyn Gatomon, Joseph William Delisle, Jonathan Leane, Luke Pendergrass, SuperWojo, Sebastain Graf, Will Dee, Fred von Graf, Andrey, Dan Guido, Daniel P. Andersen, Nitin Borwankar, Elle, Vitor Caleffi, biorpg, jjj, NimbleBox.ai, Pieter, Matthew Berman, terasurfer, Michael Davis, Alex, Stanislav Ovsiannikov Thank you to all my generous patrons and donaters! And thank you again to a16z for their generous grant. # Original model card: Meta Llama 2's Llama 2 7B Chat # **Llama 2** Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 7B fine-tuned model, optimized for dialogue use cases and converted for the Hugging Face Transformers format. Links to other models can be found in the index at the bottom. ## Model Details *Note: Use of this model is governed by the Meta license. 
In order to download the model weights and tokenizer, please visit the website and accept our License before requesting access here.* Meta developed and publicly released the Llama 2 family of large language models (LLMs), a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama-2-Chat, are optimized for dialogue use cases. Llama-2-Chat models outperform open-source chat models on most benchmarks we tested, and in our human evaluations for helpfulness and safety, are on par with some popular closed-source models like ChatGPT and PaLM. **Model Developers** Meta **Variations** Llama 2 comes in a range of parameter sizes — 7B, 13B, and 70B — as well as pretrained and fine-tuned variations. **Input** Models input text only. **Output** Models generate text only. **Model Architecture** Llama 2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align to human preferences for helpfulness and safety. ||Training Data|Params|Content Length|GQA|Tokens|LR| |---|---|---|---|---|---|---| |Llama 2|*A new mix of publicly available online data*|7B|4k|✗|2.0T|3.0 x 10-4| |Llama 2|*A new mix of publicly available online data*|13B|4k|✗|2.0T|3.0 x 10-4| |Llama 2|*A new mix of publicly available online data*|70B|4k|✔|2.0T|1.5 x 10-4| *Llama 2 family of models.* Token counts refer to pretraining data only. All models are trained with a global batch-size of 4M tokens. Bigger models - 70B -- use Grouped-Query Attention (GQA) for improved inference scalability. **Model Dates** Llama 2 was trained between January 2023 and July 2023. **Status** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. **License** A custom commercial license is available at: **Research Paper** \"Llama-2: Open Foundation and Fine-tuned Chat Models\" ## Intended Use **Intended Use Cases** Llama 2 is intended for commercial and research use in English. Tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. To get the expected features and performance for the chat versions, a specific formatting needs to be followed, including the and tags, and tokens, and the whitespaces and breaklines in between (we recommend calling on inputs to avoid double-spaces). See our reference code in github for details: []( **Out-of-scope Uses** Use in any manner that violates applicable laws or regulations (including trade compliance laws).Use in languages other than English. Use in any other way that is prohibited by the Acceptable Use Policy and Licensing Agreement for Llama 2. ## Hardware and Software **Training Factors** We used custom training libraries, Meta's Research Super Cluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute. **Carbon Footprint** Pretraining utilized a cumulative 3.3M GPU hours of computation on hardware of type A100-80GB (TDP of 350-400W). Estimated total emissions were 539 tCO2eq, 100% of which were offset by Meta’s sustainability program. 
||Time (GPU hours)|Power Consumption (W)|Carbon Emitted(tCO2eq)| |---|---|---|---| |Llama 2 7B|184320|400|31.22| |Llama 2 13B|368640|400|62.44| |Llama 2 70B|1720320|400|291.42| |Total|3311616||539.00| **CO2 emissions during pretraining.** Time: total GPU time required for training each model. Power Consumption: peak power capacity per GPU device for the GPUs used adjusted for power usage efficiency. 100% of the emissions are directly offset by Meta's sustainability program, and because we are openly releasing these models, the pretraining costs do not need to be incurred by others. ## Training Data **Overview** Llama 2 was pretrained on 2 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over one million new human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data. **Data Freshness** The pretraining data has a cutoff of September 2022, but some tuning data is more recent, up to July 2023. ## Evaluation Results In this section, we report the results for the Llama 1 and Llama 2 models on standard academic benchmarks.For all the evaluations, we use our internal evaluations library. |Model|Size|Code|Commonsense Reasoning|World Knowledge|Reading Comprehension|Math|MMLU|BBH|AGI Eval| |---|---|---|---|---|---|---|---|---|---| |Llama 1|7B|14.1|60.8|46.2|58.5|6.95|35.1|30.3|23.9| |Llama 1|13B|18.9|66.1|52.6|62.3|10.9|46.9|37.0|33.9| |Llama 1|33B|26.0|70.0|58.4|67.6|21.4|57.8|39.8|41.7| |Llama 1|65B|30.7|70.7|60.5|68.6|30.8|63.4|43.5|47.6| |Llama 2|7B|16.8|63.9|48.9|61.3|14.6|45.3|32.6|29.3| |Llama 2|13B|24.5|66.9|55.4|65.8|28.7|54.8|39.4|39.1| |Llama 2|70B|**37.5**|**71.9**|**63.6**|**69.4**|**35.2**|**68.9**|**51.2**|**54.2**| **Overall performance on grouped academic benchmarks.** *Code:* We report the average pass@1 scores of our models on HumanEval and MBPP. *Commonsense Reasoning:* We report the average of PIQA, SIQA, HellaSwag, WinoGrande, ARC easy and challenge, OpenBookQA, and CommonsenseQA. We report 7-shot results for CommonSenseQA and 0-shot results for all other benchmarks. *World Knowledge:* We evaluate the 5-shot performance on NaturalQuestions and TriviaQA and report the average. *Reading Comprehension:* For reading comprehension, we report the 0-shot average on SQuAD, QuAC, and BoolQ. *MATH:* We report the average of the GSM8K (8 shot) and MATH (4 shot) benchmarks at top 1. |||TruthfulQA|Toxigen| |---|---|---|---| |Llama 1|7B|27.42|23.00| |Llama 1|13B|41.74|23.08| |Llama 1|33B|44.19|22.57| |Llama 1|65B|48.71|21.77| |Llama 2|7B|33.29|**21.25**| |Llama 2|13B|41.86|26.10| |Llama 2|70B|**50.18**|24.60| **Evaluation of pretrained LLMs on automatic safety benchmarks.** For TruthfulQA, we present the percentage of generations that are both truthful and informative (the higher the better). For ToxiGen, we present the percentage of toxic generations (the smaller the better). |||TruthfulQA|Toxigen| |---|---|---|---| |Llama-2-Chat|7B|57.04|**0.00**| |Llama-2-Chat|13B|62.18|**0.00**| |Llama-2-Chat|70B|**64.14**|0.01| **Evaluation of fine-tuned LLMs on different safety datasets.** Same metric definitions as above. ## Ethical Considerations and Limitations Llama 2 is a new technology that carries risks with use. Testing conducted to date has been in English, and has not covered, nor could it cover all scenarios. 
For these reasons, as with all LLMs, Llama 2’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 2, developers should perform safety testing and tuning tailored to their specific applications of the model. Please see the Responsible Use Guide available at ## Reporting Issues Please report any software “bug,” or other problems with the models through one of the following means: - Reporting issues with the model: github.com/facebookresearch/llama - Reporting problematic content generated by the model: developers.facebook.com/llama_output_feedback - Reporting bugs and security concerns: facebook.com/whitehat/info ## Llama Model Index |Model|Llama2|Llama2-hf|Llama2-chat|Llama2-chat-hf| |---|---|---|---|---| |7B| Link | Link | Link | Link| |13B| Link | Link | Link | Link| |70B| Link | Link | Link | Link| ",
+ "model_explanation_gemini": "A 7B-parameter Llama 2-based chat model in GGUF format, optimized for helpful, safe, and honest text generation while avoiding harmful or unethical content.\n\n**Features:** \n- Fine-tuned for chat/assistant tasks \n- Enforces safety and ethical guidelines in responses \n- Quantized in GGUF format for CPU/GPU efficiency \n- Supports multiple inference clients (llama.cpp, KoboldCpp, etc.) \n\n**Comparison:** \nThis GGUF-quant"
+}
\ No newline at end of file
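The explanation above mentions GGUF quantization and llama.cpp-family clients. Below is a minimal local-inference sketch with the llama-cpp-python bindings; the GGUF filename and the Llama-2 chat prompt are illustrative assumptions, not taken from this entry.

```python
# Hedged sketch: local inference on a Llama-2-7B-Chat GGUF file via llama-cpp-python.
# The model path and prompt format below are assumptions for illustration only.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-2-7b-chat.Q4_K_M.gguf",  # hypothetical local file
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers if built with GPU support; use 0 for CPU-only
)

prompt = "[INST] <<SYS>>\nYou are a helpful assistant.\n<</SYS>>\n\nWhat is the GGUF format? [/INST]"
out = llm(prompt, max_tokens=256, temperature=0.7)
print(out["choices"][0]["text"])
```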
diff --git a/model_data_json/TheBloke_Mixtral-8x7B-Instruct-v0.1-AWQ.json b/model_data_json/TheBloke_Mixtral-8x7B-Instruct-v0.1-AWQ.json
new file mode 100644
index 0000000000000000000000000000000000000000..bf3c4e901676d6ceb2d7d1da53f0a83e1ad5ff9f
--- /dev/null
+++ b/model_data_json/TheBloke_Mixtral-8x7B-Instruct-v0.1-AWQ.json
@@ -0,0 +1,26 @@
+{
+ "model_id": "TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ",
+ "downloads": 75286,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "mixtral",
+ "text-generation",
+ "conversational",
+ "fr",
+ "it",
+ "de",
+ "es",
+ "en",
+ "base_model:mistralai/Mixtral-8x7B-Instruct-v0.1",
+ "base_model:quantized:mistralai/Mixtral-8x7B-Instruct-v0.1",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "4-bit",
+ "awq",
+ "region:us"
+ ],
+ "description": "--- base_model: mistralai/Mixtral-8x7B-Instruct-v0.1 inference: false language: - fr - it - de - es - en license: apache-2.0 model_creator: Mistral AI_ model_name: Mixtral 8X7B Instruct v0.1 model_type: mixtral prompt_template: '[INST] {prompt} [/INST] ' quantized_by: TheBloke widget: - output: text: 'Arr, shiver me timbers! Ye have a llama on yer lawn, ye say? Well, that be a new one for me! Here''s what I''d suggest, arr: 1. Firstly, ensure yer safety. Llamas may look gentle, but they can be protective if they feel threatened. 2. Try to make the area less appealing to the llama. Remove any food sources or water that might be attracting it. 3. Contact local animal control or a wildlife rescue organization. They be the experts and can provide humane ways to remove the llama from yer property. 4. If ye have any experience with animals, you could try to gently herd the llama towards a nearby field or open space. But be careful, arr! Remember, arr, it be important to treat the llama with respect and care. It be a creature just trying to survive, like the rest of us.' text: '[INST] You are a pirate chatbot who always responds with Arr and pirate speak! There''s a llama on my lawn, how can I get rid of him? [/INST]' ---
TheBloke's LLM work is generously supported by a grant from
# Mixtral 8X7B Instruct v0.1 - AWQ - Model creator: Mistral AI_ - Original model: Mixtral 8X7B Instruct v0.1 ## Description This repo contains AWQ model files for Mistral AI_'s Mixtral 8X7B Instruct v0.1. ### About AWQ AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization. Compared to GPTQ, it offers faster Transformers-based inference with equivalent or better quality compared to the most commonly used GPTQ settings. AWQ models are currently supported on Linux and Windows, with NVidia GPUs only. macOS users: please use GGUF models instead. It is supported by: - Text Generation Webui - using Loader: AutoAWQ - vLLM - version 0.2.2 or later for support for all model types. - Hugging Face Text Generation Inference (TGI) - Transformers version 4.35.0 and later, from any code or client that supports Transformers - AutoAWQ - for use from Python code ## Repositories available * AWQ model(s) for GPU inference. * GPTQ models for GPU inference, with multiple quantisation parameter options. * 2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference * Mistral AI_'s original unquantised fp16 model in pytorch format, for GPU inference and for further conversions ## Prompt template: Mistral ## Provided files, and AWQ parameters I currently release 128g GEMM models only. The addition of group_size 32 models, and GEMV kernel models, is being actively considered. Models are released as sharded safetensors files. | Branch | Bits | GS | AWQ Dataset | Seq Len | Size | | ------ | ---- | -- | ----------- | ------- | ---- | | main | 4 | 128 | VMware Open Instruct | 8192 | 24.65 GB ## How to easily download and use this model in text-generation-webui Please make sure you're using the latest version of text-generation-webui. It is strongly recommended to use the text-generation-webui one-click-installers unless you're sure you know how to make a manual install. 1. Click the **Model tab**. 2. Under **Download custom model or LoRA**, enter . 3. Click **Download**. 4. The model will start downloading. Once it's finished it will say \"Done\". 5. In the top left, click the refresh icon next to **Model**. 6. In the **Model** dropdown, choose the model you just downloaded: 7. Select **Loader: AutoAWQ**. 8. Click Load, and the model will load and is now ready for use. 9. If you want any custom settings, set them and then click **Save settings for this model** followed by **Reload the Model** in the top right. 10. Once you're ready, click the **Text Generation** tab and enter a prompt to get started! ## Multi-user inference server: vLLM Documentation on installing and using vLLM can be found here. - Please ensure you are using vLLM version 0.2 or later. - When using vLLM as a server, pass the parameter. For example: - When using vLLM from Python code, again set . For example: ## Multi-user inference server: Hugging Face Text Generation Inference (TGI) Use TGI version 1.1.0 or later. The official Docker container is: Example Docker parameters: Example Python code for interfacing with TGI (requires huggingface-hub 0.17.0 or later): ## Inference from Python code using Transformers ### Install the necessary packages - Requires: Transformers 4.35.0 or later. - Requires: AutoAWQ 0.1.6 or later. Note that if you are using PyTorch 2.0.1, the above AutoAWQ command will automatically upgrade you to PyTorch 2.1.0. 
If you are using CUDA 11.8 and wish to continue using PyTorch 2.0.1, instead run this command: If you have problems installing AutoAWQ using the pre-built wheels, install it from source instead: ### Transformers example code (requires Transformers 4.35.0 and later) ## Compatibility The files provided are tested to work with: - text-generation-webui using . - vLLM version 0.2.0 and later. - Hugging Face Text Generation Inference (TGI) version 1.1.0 and later. - Transformers version 4.35.0 and later. - AutoAWQ version 0.1.1 and later. ## Discord For further support, and discussions on these models and AI in general, join us at: TheBloke AI's Discord server ## Thanks, and how to contribute Thanks to the chirper.ai team! Thanks to Clay from gpus.llm-utils.org! I've had a lot of people ask if they can contribute. I enjoy providing models and helping people, and would love to be able to spend even more time doing it, as well as expanding into new projects like fine tuning/training. If you're able and willing to contribute it will be most gratefully received and will help me to keep providing more models, and to start work on new AI projects. Donaters will get priority support on any and all AI/LLM/model questions and requests, access to a private Discord room, plus other benefits. * Patreon: * Ko-Fi: **Special thanks to**: Aemon Algiz. **Patreon special mentions**: Michael Levine, 阿明, Trailburnt, Nikolai Manek, John Detwiler, Randy H, Will Dee, Sebastain Graf, NimbleBox.ai, Eugene Pentland, Emad Mostaque, Ai Maven, Jim Angel, Jeff Scroggin, Michael Davis, Manuel Alberto Morcote, Stephen Murray, Robert, Justin Joy, Luke @flexchar, Brandon Frisco, Elijah Stavena, S_X, Dan Guido, Undi ., Komninos Chatzipapas, Shadi, theTransient, Lone Striker, Raven Klaugh, jjj, Cap'n Zoog, Michel-Marie MAUDET (LINAGORA), Matthew Berman, David, Fen Risland, Omer Bin Jawed, Luke Pendergrass, Kalila, OG, Erik Bjäreholt, Rooh Singh, Joseph William Delisle, Dan Lewis, TL, John Villwock, AzureBlack, Brad, Pedro Madruga, Caitlyn Gatomon, K, jinyuan sun, Mano Prime, Alex, Jeffrey Morgan, Alicia Loh, Illia Dulskyi, Chadd, transmissions 11, fincy, Rainer Wilmers, ReadyPlayerEmma, knownsqashed, Mandus, biorpg, Deo Leter, Brandon Phillips, SuperWojo, Sean Connelly, Iucharbius, Jack West, Harry Royden McLaughlin, Nicholas, terasurfer, Vitor Caleffi, Duane Dunston, Johann-Peter Hartmann, David Ziegler, Olakabola, Ken Nordquist, Trenton Dambrowitz, Tom X Nguyen, Vadim, Ajan Kanaga, Leonard Tan, Clay Pascal, Alexandros Triantafyllidis, JM33133, Xule, vamX, ya boyyy, subjectnull, Talal Aujan, Alps Aficionado, wassieverse, Ari Malik, James Bentley, Woland, Spencer Kim, Michael Dempsey, Fred von Graf, Elle, zynix, William Richards, Stanislav Ovsiannikov, Edmond Seymore, Jonathan Leane, Martin Kemka, usrbinkat, Enrico Ros Thank you to all my generous patrons and donaters! And thank you again to a16z for their generous grant. # Original model card: Mistral AI_'s Mixtral 8X7B Instruct v0.1 # Model Card for Mixtral-8x7B The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. The Mixtral-8x7B outperforms Llama 2 70B on most benchmarks we tested. For full details of this model please read our release blog post. ## Warning This repo contains weights that are compatible with vLLM serving of the model as well as Hugging Face transformers library. It is based on the original Mixtral torrent release, but the file format and parameter names are different. 
Please note that model cannot (yet) be instantiated with HF. ## Instruction format This format must be strictly respected, otherwise the model will generate sub-optimal outputs. The template used to build a prompt for the Instruct model is defined as follows: Note that and are special tokens for beginning of string (BOS) and end of string (EOS) while [INST] and [/INST] are regular strings. As reference, here is the pseudo-code used to tokenize instructions during fine-tuning: In the pseudo-code above, note that the method should not add a BOS or EOS token automatically, but should add a prefix space. ## Run the model By default, transformers will load the model in full precision. Therefore you might be interested to further reduce down the memory requirements to run the model through the optimizations we offer in HF ecosystem: ### In half-precision Note precision only works on GPU devices Click to expand ### Lower precision using (8-bit & 4-bit) using Click to expand ### Load the model with Flash Attention 2 Click to expand ## Limitations The Mixtral-8x7B Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanisms. We're looking forward to engaging with the community on ways to make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs. # The Mistral AI Team Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Louis Ternon, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.",
+ "model_explanation_gemini": "A quantized version of Mistral AI's Mixtral-8x7B-Instruct-v0.1 model optimized for efficient inference using AWQ (4-bit quantization), supporting multilingual text generation with a pirate-themed example demonstrating its conversational capabilities. \n\n**Features:** \n- Base model: `mistralai/Mixtral-8x7B-Instruct-v0.1` \n- Quantization: 4-bit AWQ for faster inference with comparable accuracy to GPTQ \n-"
+}
\ No newline at end of file
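The description lists plain Transformers (4.35.0 or later, with AutoAWQ installed) as one supported inference path for this AWQ checkpoint. A minimal, hedged sketch of that route follows; the prompt uses the Mistral [INST] template from the card and the generation settings are illustrative.

```python
# Hedged sketch: loading the AWQ-quantized Mixtral checkpoint with Transformers (>= 4.35.0).
# Requires the autoawq package; generation parameters are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "[INST] Write a haiku about weight quantization. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```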
diff --git a/model_data_json/UBC-NLP_MARBERT.json b/model_data_json/UBC-NLP_MARBERT.json
new file mode 100644
index 0000000000000000000000000000000000000000..27f3a663c3a30fa8355fb66c296fb1f5fa39c166
--- /dev/null
+++ b/model_data_json/UBC-NLP_MARBERT.json
@@ -0,0 +1,22 @@
+{
+ "model_id": "UBC-NLP/MARBERT",
+ "downloads": 80475,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "tf",
+ "jax",
+ "bert",
+ "fill-mask",
+ "Arabic BERT",
+ "MSA",
+ "Twitter",
+ "Masked Langauge Model",
+ "ar",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- language: - ar tags: - Arabic BERT - MSA - Twitter - Masked Langauge Model widget: - text: \"اللغة العربية هي لغة [MASK].\" --- **MARBERT** is one of three models described in our **ACL 2021 paper** **\"ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic\"**. MARBERT is a large-scale pre-trained masked language model focused on both Dialectal Arabic (DA) and MSA. Arabic has multiple varieties. To train MARBERT, we randomly sample 1B Arabic tweets from a large in-house dataset of about 6B tweets. We only include tweets with at least 3 Arabic words, based on character string matching, regardless whether the tweet has non-Arabic string or not. That is, we do not remove non-Arabic so long as the tweet meets the 3 Arabic word criterion. The dataset makes up **128GB of text** (**15.6B tokens**). We use the same network architecture as ARBERT (BERT-base), but without the next sentence prediction (NSP) objective since tweets are short. See our repo for modifying BERT code to remove NSP. For more information about MARBERT, please visit our own GitHub repo. # BibTex If you use our models (ARBERT, MARBERT, or MARBERTv2) for your scientific publication, or if you find the resources in this repository useful, please cite our paper as follows (to be updated): ## Acknowledgments We gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada, the Social Sciences and Humanities Research Council of Canada, Canadian Foundation for Innovation, ComputeCanada and UBC ARC-Sockeye. We also thank the Google TensorFlow Research Cloud (TFRC) program for providing us with free TPU access.",
+ "model_explanation_gemini": "\"MARBERT is a large-scale Arabic BERT model pre-trained on 1B Arabic tweets for both Dialectal Arabic and Modern Standard Arabic, optimized for masked language modeling without next sentence prediction.\"\n\nModel Features: \n- Language: Arabic (Dialectal and Modern Standard) \n- Training Data: 1B Arabic tweets (128GB, 15.6B tokens) \n- Architecture: BERT-base (no NSP objective) \n- Use Case: Masked language modeling"
+}
\ No newline at end of file
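Since MARBERT is a masked language model, a short fill-mask sketch may help; it reuses the Arabic widget example from the card and assumes a standard transformers install.

```python
# Sketch: masked-token prediction with MARBERT, using the widget example from the card.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="UBC-NLP/MARBERT")
for pred in fill_mask("اللغة العربية هي لغة [MASK]."):
    print(pred["token_str"], round(pred["score"], 3))
```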
diff --git a/model_data_json/ZhenYe234_hubert_base_general_audio.json b/model_data_json/ZhenYe234_hubert_base_general_audio.json
new file mode 100644
index 0000000000000000000000000000000000000000..a6fac9270cfa025c8c2b896117318f124c9a9266
--- /dev/null
+++ b/model_data_json/ZhenYe234_hubert_base_general_audio.json
@@ -0,0 +1,13 @@
+{
+ "model_id": "ZhenYe234/hubert_base_general_audio",
+ "downloads": 68883,
+ "tags": [
+ "safetensors",
+ "hubert",
+ "arxiv:2408.17175",
+ "license:apache-2.0",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 --- "
+}
\ No newline at end of file
diff --git a/model_data_json/aglazkova_bart_finetuned_keyphrase_extraction.json b/model_data_json/aglazkova_bart_finetuned_keyphrase_extraction.json
new file mode 100644
index 0000000000000000000000000000000000000000..3b6080436cb1320579d54d960e2f390b326155a8
--- /dev/null
+++ b/model_data_json/aglazkova_bart_finetuned_keyphrase_extraction.json
@@ -0,0 +1,22 @@
+{
+ "model_id": "aglazkova/bart_finetuned_keyphrase_extraction",
+ "downloads": 83042,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "bart",
+ "text2text-generation",
+ "en",
+ "dataset:midas/krapivin",
+ "dataset:midas/inspec",
+ "dataset:midas/kptimes",
+ "dataset:midas/duc2001",
+ "arxiv:1910.13461",
+ "arxiv:2312.10700",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- datasets: - midas/krapivin - midas/inspec - midas/kptimes - midas/duc2001 language: - en widget: - text: \"Relevance has traditionally been linked with feature subset selection, but formalization of this link has not been attempted. In this paper, we propose two axioms for feature subset selection sufficiency axiom and necessity axiombased on which this link is formalized: The expected feature subset is the one which maximizes relevance. Finding the expected feature subset turns out to be NP-hard. We then devise a heuristic algorithm to find the expected subset which has a polynomial time complexity. The experimental results show that the algorithm finds good enough subset of features which, when presented to C4.5, results in better prediction accuracy.\" - text: \"In this paper, we investigate cross-domain limitations of keyphrase generation using the models for abstractive text summarization. We present an evaluation of BART fine-tuned for keyphrase generation across three types of texts, namely scientific texts from computer science and biomedical domains and news texts. We explore the role of transfer learning between different domains to improve the model performance on small text corpora.\" --- # BART fine-tuned for keyphrase generation This is the ) model finetuned for the keyphrase generation task () on the fragments of the following corpora: * Krapivin () * Inspec () * KPTimes () * DUC-2001 () * PubMed () * NamedKeys (). #### Training Hyperparameters The following hyperparameters were used during training: * learning_rate: 4e-5 * train_batch_size: 8 * optimizer: AdamW with betas=(0.9,0.999) and epsilon=1e-08 * num_epochs: 6 **BibTeX:**",
+ "model_explanation_gemini": "\"Finetuned BART model for generating keyphrases from scientific and news texts, trained on multiple datasets including Krapivin, Inspec, and KPTimes.\"\n\nModel Features: \n- Task: Keyphrase generation \n- Base Model: BART \n- Training Datasets: Krapivin, Inspec, KPTimes, DUC-2001, PubMed, NamedKeys \n- Supported Language: English \n- Training Hyperparameters: \n - Learning rate: 4e"
+}
\ No newline at end of file
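The entry describes a BART model fine-tuned for keyphrase generation (a text2text-generation task). A minimal sketch using the Transformers pipeline follows, with one of the card's own widget abstracts as input; the length setting is illustrative.

```python
# Sketch: keyphrase generation with the fine-tuned BART model via a text2text pipeline.
from transformers import pipeline

generator = pipeline(
    "text2text-generation", model="aglazkova/bart_finetuned_keyphrase_extraction"
)
abstract = (
    "In this paper, we investigate cross-domain limitations of keyphrase generation "
    "using the models for abstractive text summarization."
)
print(generator(abstract, max_new_tokens=64)[0]["generated_text"])
```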
diff --git a/model_data_json/aipicasso_emi.json b/model_data_json/aipicasso_emi.json
new file mode 100644
index 0000000000000000000000000000000000000000..86935d52a35f895473829d8efad985ec3002f89a
--- /dev/null
+++ b/model_data_json/aipicasso_emi.json
@@ -0,0 +1,18 @@
+{
+ "model_id": "aipicasso/emi",
+ "downloads": 274385,
+ "tags": [
+ "diffusers",
+ "safetensors",
+ "stable-diffusion",
+ "text-to-image",
+ "arxiv:2307.01952",
+ "arxiv:2212.03860",
+ "license:openrail++",
+ "autotrain_compatible",
+ "diffusers:StableDiffusionXLPipeline",
+ "region:us"
+ ],
+ "description": "--- extra_gated_prompt: このモデルをこのページからダウンロードするためにはHugging Faceに登録された情報を提供する必要があります。この提供された情報は画像生成AIを活用する情報を案内するために使われます。 To download this model from this page, you need to provide information registered with Hugging Face. The information provided will be used to guide you on how to utilize the image-generation AI. license: openrail++ tags: - stable-diffusion - text-to-image inference: false library_name: diffusers --- # Emi Model Card !eyecatch.jpg Original(PNG) English: Click Here # はじめに Emi (Ethereal master of illustration) は、 最先端の開発機材H100と画像生成Stable Diffusion XL 1.0を用いて AI Picasso社が開発したAIアートに特化した画像生成AIです。 このモデルの特徴として、Danbooruなどにある無断転載画像を学習していないことがあげられます。 # ライセンスについて ライセンスについては、これまでとは違い、 CreativeML Open RAIL++-M License です。 したがって、**商用利用可能**です。 これは次のように判断したためです。 - 画像生成AIが普及するに伴い、創作業界に悪影響を及ぼさないように、マナーを守る人が増えてきたため - 他の画像生成AIが商用可能である以上、あまり非商用ライセンスである実効性がなくなってきたため # 使い方 ここからデモを利用することができます。 本格的に利用する人はここからモデルをダウンロードできます。 通常版で生成がうまく行かない場合は、安定版をお使いください。 # シンプルな作品例 !example_1.jpg !example_2.png !example_3.jpg # モデルの出力向上について - 確実にアニメ調のイラストを出したいときは、anime artwork, anime styleとプロンプトの先頭に入れてください。 - プロンプトにtransparentという言葉を入れると、より最近の画風になります。 - 全身 (full body) を描くとうまく行かない場合もあるため、そのときは安定版をお試しください。 - 使えるプロンプトはWaifu Diffusionと同じです。また、Stable Diffusionのように使うこともできます。 - ネガティブプロンプトにTextual Inversionを使用することをおすすめします。 - 手が不安定なため、DreamShaper XL1.0などの実写系モデルとのマージをおすすめします。 - ChatGPTを用いてプロンプトを洗練すると、自分の枠を超えた作品に出会えます。 - 最新のComfyUIにあるFreeUノード、またはWeb UIの拡張機能を次のパラメータで使うとさらに出力が上がる可能性があります。次の画像はFreeUを使った例です。 - b1 = 1.1, b2 = 1.2, s1 = 0.6, s2 = 0.4 report !example_4.png # 法律について 本モデルは日本にて作成されました。したがって、日本の法律が適用されます。 本モデルの学習は、著作権法第30条の4に基づき、合法であると主張します。 また、本モデルの配布については、著作権法や刑法175条に照らしてみても、 正犯や幇助犯にも該当しないと主張します。詳しくは柿沼弁護士の見解を御覧ください。 ただし、ライセンスにもある通り、本モデルの生成物は各種法令に従って取り扱って下さい。 # 連絡先 support@aipicasso.app 以下、一般的なモデルカードの日本語訳です。 ## モデル詳細 - **モデルタイプ:** 拡散モデルベースの text-to-image 生成モデル - **言語:** 日本語 - **ライセンス:** CreativeML Open RAIL++-M License - **モデルの説明:** このモデルはプロンプトに応じて適切な画像を生成することができます。アルゴリズムは Latent Diffusion Model と OpenCLIP-ViT/G、CLIP-L です。 - **補足:** - **参考文献:** ## モデルの使用例 Stable Diffusion XL 1.0と同じ使い方です。 たくさんの方法がありますが、3つのパターンを提供します。 - ComfyUI - Fooocus - Diffusers ### ComfyUIやFooocusの場合 Stable Diffusion XL 1.0 の使い方と同じく、safetensor形式のモデルファイルを使ってください。 詳しいインストール方法は、こちらの記事を参照してください。 ### Diffusersの場合 🤗's Diffusers library を使ってください。 まずは、以下のスクリプトを実行し、ライブラリをいれてください。 次のスクリプトを実行し、画像を生成してください。 複雑な操作はデモのソースコードを参考にしてください。 #### 想定される用途 - イラストや漫画、アニメの作画補助 - 商用・非商用は問わない - 依頼の際のクリエイターとのコミュニケーション - 画像生成サービスの商用提供 - 生成物の取り扱いには注意して使ってください。 - 自己表現 - このAIを使い、「あなた」らしさを発信すること - 研究開発 - Discord上でのモデルの利用 - プロンプトエンジニアリング - ファインチューニング(追加学習とも) - DreamBooth など - 他のモデルとのマージ - 本モデルの性能をFIDなどで調べること - 本モデルがStable Diffusion以外のモデルとは独立であることをチェックサムやハッシュ関数などで調べること - 教育 - 美大生や専門学校生の卒業制作 - 大学生の卒業論文や課題制作 - 先生が画像生成AIの現状を伝えること - Hugging Face の Community にかいてある用途 - 日本語か英語で質問してください #### 想定されない用途 - 物事を事実として表現するようなこと - 先生を困らせるようなこと - その他、創作業界に悪影響を及ぼすこと # 使用してはいけない用途や悪意のある用途 - マネー・ロンダリングに用いないでください - デジタル贋作 (Digital Forgery) は公開しないでください(著作権法に違反するおそれ) - 他人の作品を無断でImage-to-Imageしないでください(著作権法に違反するおそれ) - わいせつ物を頒布しないでください (刑法175条に違反するおそれ) - いわゆる業界のマナーを守らないようなこと - 事実に基づかないことを事実のように語らないようにしてください(威力業務妨害罪が適用されるおそれ) - フェイクニュース ## モデルの限界やバイアス ### モデルの限界 - 拡散モデルや大規模言語モデルは、いまだに未知の部分が多く、その限界は判明していない。 ### バイアス - 拡散モデルや大規模言語モデルは、いまだに未知の部分が多く、バイアスは判明していない。 ## 学習 **学習データ** - Stable Diffusionと同様のデータセットからDanbooruの無断転載画像を取り除いて手動で集めた約2000枚の画像 - Stable Diffusionと同様のデータセットからDanbooruの無断転載画像を取り除いて自動で集めた約50万枚の画像 **学習プロセス** - **ハードウェア:** H100 ## 評価結果 第三者による評価を求めています。 ## 環境への影響 - **ハードウェアタイプ:** H100 - 
**使用時間(単位は時間):** 500 - **学習した場所:** 日本 ## 参考文献",
+ "model_explanation_gemini": "Generates anime-style illustrations using Stable Diffusion XL 1.0, avoiding unauthorized Danbooru images, and allows commercial use under the Open RAIL++-M license.\n\n**Features:** \n- Text-to-image diffusion model (Stable Diffusion XL-based) \n- Specialized for anime artwork generation \n- Commercial use permitted \n- Optimized prompts (e.g., \"anime style\", \"transparent\") \n- Compatible with Waifu Diffusion/Stable Diffusion prompts \n-"
+}
\ No newline at end of file
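The card's Diffusers instructions refer to an install-and-generate script that is not reproduced in this capture. Below is a hedged sketch of that path using StableDiffusionXLPipeline (the pipeline class listed in the entry's tags); the prompt follows the card's own suggestions (anime artwork, anime style, transparent) and all sampler settings are illustrative.

```python
# Hedged sketch of the "Diffusers" usage path described in the Emi card.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "aipicasso/emi", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="anime artwork, anime style, 1girl, looking at viewer, transparent",
    negative_prompt="lowres, bad anatomy",   # Textual Inversion negatives are recommended by the card
    num_inference_steps=25,
).images[0]
image.save("emi_sample.png")
```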
diff --git a/model_data_json/allegro_herbert-base-cased.json b/model_data_json/allegro_herbert-base-cased.json
new file mode 100644
index 0000000000000000000000000000000000000000..b5f859cf31483c303ef637e25ce30ebbb2d56389
--- /dev/null
+++ b/model_data_json/allegro_herbert-base-cased.json
@@ -0,0 +1,19 @@
+{
+ "model_id": "allegro/herbert-base-cased",
+ "downloads": 78442,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "tf",
+ "jax",
+ "bert",
+ "feature-extraction",
+ "herbert",
+ "pl",
+ "license:cc-by-4.0",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- language: pl tags: - herbert license: cc-by-4.0 --- # HerBERT **HerBERT** is a BERT-based Language Model trained on Polish corpora using Masked Language Modelling (MLM) and Sentence Structural Objective (SSO) with dynamic masking of whole words. For more details, please refer to: HerBERT: Efficiently Pretrained Transformer-based Language Model for Polish. Model training and experiments were conducted with transformers in version 2.9. ## Corpus HerBERT was trained on six different corpora available for Polish language: | Corpus | Tokens | Documents | | :------ | ------: | ------: | | CCNet Middle | 3243M | 7.9M | | CCNet Head | 2641M | 7.0M | | National Corpus of Polish| 1357M | 3.9M | | Open Subtitles | 1056M | 1.1M | Wikipedia | 260M | 1.4M | | Wolne Lektury | 41M | 5.5k | ## Tokenizer The training dataset was tokenized into subwords using a character level byte-pair encoding (``. ## Usage Example code: ## License CC BY 4.0 ## Citation If you use this model, please cite the following paper: ## Authors The model was trained by **Machine Learning Research Team at Allegro** and **Linguistic Engineering Group at Institute of Computer Science, Polish Academy of Sciences**. You can contact us at: klejbenchmark@allegro.pl",
+ "model_explanation_gemini": "\"HerBERT is a BERT-based Polish language model trained with MLM and SSO objectives on diverse Polish corpora for natural language processing tasks.\"\n\nModel Features: \n- BERT-based architecture \n- Trained on Polish corpora (CCNet, National Corpus of Polish, Open Subtitles, Wikipedia, Wolne Lektury) \n- Uses Masked Language Modeling (MLM) and Sentence Structural Objective (SSO) \n- Implements dynamic whole-word masking \n- Byte-pair"
+}
\ No newline at end of file
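The "Example code" section of the HerBERT card is empty in this capture. A minimal feature-extraction sketch with plain Transformers follows; the Polish input sentence is illustrative.

```python
# Sketch: extracting contextual embeddings from HerBERT with plain Transformers.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("allegro/herbert-base-cased")
model = AutoModel.from_pretrained("allegro/herbert-base-cased")

inputs = tokenizer("Allegro to największa platforma e-commerce w Polsce.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, seq_len, 768)
```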
diff --git a/model_data_json/answerdotai_ModernBERT-large.json b/model_data_json/answerdotai_ModernBERT-large.json
new file mode 100644
index 0000000000000000000000000000000000000000..8b040e94b6b28036ce2ede3f434c96ee5c3270e5
--- /dev/null
+++ b/model_data_json/answerdotai_ModernBERT-large.json
@@ -0,0 +1,21 @@
+{
+ "model_id": "answerdotai/ModernBERT-large",
+ "downloads": 80863,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "onnx",
+ "safetensors",
+ "modernbert",
+ "fill-mask",
+ "masked-lm",
+ "long-context",
+ "en",
+ "arxiv:2412.13663",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "region:us"
+ ],
+ "description": "--- library_name: transformers license: apache-2.0 language: - en tags: - fill-mask - masked-lm - long-context - modernbert pipeline_tag: fill-mask inference: false --- # ModernBERT ## Table of Contents 1. Model Summary 2. Usage 3. Evaluation 4. Limitations 5. Training 6. License 7. Citation ## Model Summary ModernBERT is a modernized bidirectional encoder-only Transformer model (BERT-style) pre-trained on 2 trillion tokens of English and code data with a native context length of up to 8,192 tokens. ModernBERT leverages recent architectural improvements such as: - **Rotary Positional Embeddings (RoPE)** for long-context support. - **Local-Global Alternating Attention** for efficiency on long inputs. - **Unpadding and Flash Attention** for efficient inference. ModernBERT’s native long context length makes it ideal for tasks that require processing long documents, such as retrieval, classification, and semantic search within large corpora. The model was trained on a large corpus of text and code, making it suitable for a wide range of downstream tasks, including code retrieval and hybrid (text + code) semantic search. It is available in the following sizes: - ModernBERT-base - 22 layers, 149 million parameters - ModernBERT-large - 28 layers, 395 million parameters For more information about ModernBERT, we recommend our release blog post for a high-level overview, and our arXiv pre-print for in-depth information. *ModernBERT is a collaboration between Answer.AI, LightOn, and friends.* ## Usage You can use these models directly with the library starting from v4.48.0: Since ModernBERT is a Masked Language Model (MLM), you can use the pipeline or load it via . To use ModernBERT for downstream tasks like classification, retrieval, or QA, fine-tune it following standard BERT fine-tuning recipes. **⚠️ If your GPU supports it, we recommend using ModernBERT with Flash Attention 2 to reach the highest efficiency. To do so, install Flash Attention as follows, then use the model as normal:** Using : Using a pipeline: **Note:** ModernBERT does not use token type IDs, unlike some earlier BERT models. Most downstream usage is identical to standard BERT models on the Hugging Face Hub, except you can omit the parameter. ## Evaluation We evaluate ModernBERT across a range of tasks, including natural language understanding (GLUE), general retrieval (BEIR), long-context retrieval (MLDR), and code retrieval (CodeSearchNet and StackQA). **Key highlights:** - On GLUE, ModernBERT-base surpasses other similarly-sized encoder models, and ModernBERT-large is second only to Deberta-v3-large. - For general retrieval tasks, ModernBERT performs well on BEIR in both single-vector (DPR-style) and multi-vector (ColBERT-style) settings. - Thanks to the inclusion of code data in its training mixture, ModernBERT as a backbone also achieves new state-of-the-art code retrieval results on CodeSearchNet and StackQA. 
### Base Models

| Model | IR (DPR) | IR (DPR) | IR (DPR) | IR (ColBERT) | IR (ColBERT) | NLU | Code | Code |
|-------------|--------------|--------------|--------------|---------------|---------------|------|------|------|
| | BEIR | MLDR_OOD | MLDR_ID | BEIR | MLDR_OOD | GLUE | CSN | SQA |
| BERT | 38.9 | 23.9 | 32.2 | 49.0 | 28.1 | 84.7 | 41.2 | 59.5 |
| RoBERTa | 37.7 | 22.9 | 32.8 | 48.7 | 28.2 | 86.4 | 44.3 | 59.6 |
| DeBERTaV3 | 20.2 | 5.4 | 13.4 | 47.1 | 21.9 | 88.1 | 17.5 | 18.6 |
| NomicBERT | 41.0 | 26.7 | 30.3 | 49.9 | 61.3 | 84.0 | 41.6 | 61.4 |
| GTE-en-MLM | 41.4 | **34.3** | **44.4** | 48.2 | 69.3 | 85.6 | 44.9 | 71.4 |
| ModernBERT | **41.6** | 27.4 | 44.0 | **51.3** | **80.2** | **88.4** | **56.4** | **73.6** |

### Large Models

| Model | IR (DPR) | IR (DPR) | IR (DPR) | IR (ColBERT) | IR (ColBERT) | NLU | Code | Code |
|-------------|--------------|--------------|--------------|---------------|---------------|------|------|------|
| | BEIR | MLDR_OOD | MLDR_ID | BEIR | MLDR_OOD | GLUE | CSN | SQA |
| BERT | 38.9 | 23.3 | 31.7 | 49.5 | 28.5 | 85.2 | 41.6 | 60.8 |
| RoBERTa | 41.4 | 22.6 | 36.1 | 49.8 | 28.8 | 88.9 | 47.3 | 68.1 |
| DeBERTaV3 | 25.6 | 7.1 | 19.2 | 46.7 | 23.0 | **91.4** | 21.2 | 19.7 |
| GTE-en-MLM | 42.5 | **36.4** | **48.9** | 50.7 | 71.3 | 87.6 | 40.5 | 66.9 |
| ModernBERT | **44.0** | 34.3 | 48.6 | **52.4** | **80.4** | 90.4 | **59.5** | **83.9** |

*Table 1: Results for all models across an overview of all tasks. CSN refers to CodeSearchNet and SQA to StackQA. MLDR_ID refers to in-domain (fine-tuned on the training set) evaluation, and MLDR_OOD to out-of-domain.* ModernBERT’s strong results, coupled with its efficient runtime on long-context inputs, demonstrate that encoder-only models can be significantly improved through modern architectural choices and extensive pretraining on diversified data sources. ## Limitations ModernBERT’s training data is primarily English and code, so performance may be lower for other languages. While it can handle long sequences efficiently, using the full 8,192 tokens window may be slower than short-context inference. Like any large language model, ModernBERT may produce representations that reflect biases present in its training data. Verify critical or sensitive outputs before relying on them. ## Training - Architecture: Encoder-only, Pre-Norm Transformer with GeGLU activations. - Sequence Length: Pre-trained up to 1,024 tokens, then extended to 8,192 tokens. - Data: 2 trillion tokens of English text and code. - Optimizer: StableAdamW with trapezoidal LR scheduling and 1-sqrt decay. - Hardware: Trained on 8x H100 GPUs. See the paper for more details. ## License We release the ModernBERT model architectures, model weights, training codebase under the Apache 2.0 license. ## Citation If you use ModernBERT in your work, please cite:",
+ "model_explanation_gemini": "\"ModernBERT-large is a BERT-style model optimized for long-context tasks (up to 8,192 tokens) with architectural improvements like RoPE and Flash Attention, excelling in retrieval, NLU, and code-related tasks.\"\n\n**Features:** \n- **Architecture:** Rotary Positional Embeddings (RoPE), Local-Global Alternating Attention, Unpadding, Flash Attention support. \n- **Context Length:** Native 8,192-token handling. \n- **Training"
+}
\ No newline at end of file
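The ModernBERT card mentions using the fill-mask pipeline (transformers 4.48.0 or later) without showing the snippet here. A minimal sketch; the example sentence is illustrative.

```python
# Sketch: masked-language-model inference with ModernBERT-large (requires transformers >= 4.48.0).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="answerdotai/ModernBERT-large")
print(fill_mask("The capital of France is [MASK].")[0]["sequence"])
```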
diff --git a/model_data_json/ashawkey_mvdream-sd2.1-diffusers.json b/model_data_json/ashawkey_mvdream-sd2.1-diffusers.json
new file mode 100644
index 0000000000000000000000000000000000000000..b7fdd955356b266e26c1f6faee9547ddf26d2e29
--- /dev/null
+++ b/model_data_json/ashawkey_mvdream-sd2.1-diffusers.json
@@ -0,0 +1,14 @@
+{
+ "model_id": "ashawkey/mvdream-sd2.1-diffusers",
+ "downloads": 78372,
+ "tags": [
+ "diffusers",
+ "safetensors",
+ "arxiv:2308.16512",
+ "license:openrail",
+ "diffusers:MVDreamPipeline",
+ "region:us"
+ ],
+ "description": "--- license: openrail --- # MVDream-diffusers Model Card This is a port of into diffusers. For usage, please check: ## Citation ## Misuse, Malicious Use, and Out-of-Scope Use The model should not be used to intentionally create or disseminate images that create hostile or alienating environments for people. This includes generating images that people would foreseeably find disturbing, distressing, or offensive; or content that propagates historical or current stereotypes.",
+ "model_explanation_gemini": "Port of MVDream-diffusers for generating multi-view images with Stable Diffusion 2.1.\n\nFeatures: \n- Multi-view image generation \n- Based on Stable Diffusion 2.1 \n- Diffusers integration \n\nComparison: Not applicable (no comparative details provided in the description)."
+}
\ No newline at end of file
diff --git a/model_data_json/aubmindlab_bert-base-arabert.json b/model_data_json/aubmindlab_bert-base-arabert.json
new file mode 100644
index 0000000000000000000000000000000000000000..ba689345b8f6993e87f9ecbb400ce5bbf8716dde
--- /dev/null
+++ b/model_data_json/aubmindlab_bert-base-arabert.json
@@ -0,0 +1,20 @@
+{
+ "model_id": "aubmindlab/bert-base-arabert",
+ "downloads": 74124,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "tf",
+ "jax",
+ "safetensors",
+ "bert",
+ "fill-mask",
+ "ar",
+ "arxiv:2003.00104",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- language: ar datasets: - wikipedia - Osian - 1.5B-Arabic-Corpus - oscar-arabic-unshuffled - Assafir(private) widget: - text: \" عاصم +ة لبنان هي [MASK] .\" --- # !!! A newer version of this model is available !!! AraBERTv2 # AraBERT v1 & v2 : Pre-training BERT for Arabic Language Understanding **AraBERT** is an Arabic pretrained lanaguage model based on Google's BERT architechture. AraBERT uses the same BERT-Base config. More details are available in the AraBERT Paper and in the AraBERT Meetup There are two versions of the model, AraBERTv0.1 and AraBERTv1, with the difference being that AraBERTv1 uses pre-segmented text where prefixes and suffixes were splitted using the Farasa Segmenter. We evalaute AraBERT models on different downstream tasks and compare them to mBERT), and other state of the art models (*To the extent of our knowledge*). The Tasks were Sentiment Analysis on 6 different datasets (HARD, ASTD-Balanced, ArsenTD-Lev, LABR), Named Entity Recognition with the ANERcorp, and Arabic Question Answering on Arabic-SQuAD and ARCD # AraBERTv2 ## What's New! AraBERT now comes in 4 new variants to replace the old v1 versions: More Detail in the AraBERT folder and in the README and in the AraBERT Paper Model | HuggingFace Model Name | Size (MB/Params)| Pre-Segmentation | DataSet (Sentences/Size/nWords) | ---|:---:|:---:|:---:|:---: AraBERTv0.2-base | bert-base-arabertv02 | 543MB / 136M | No | 200M / 77GB / 8.6B | AraBERTv0.2-large| bert-large-arabertv02 | 1.38G 371M | No | 200M / 77GB / 8.6B | AraBERTv2-base| bert-base-arabertv2 | 543MB 136M | Yes | 200M / 77GB / 8.6B | AraBERTv2-large| bert-large-arabertv2 | 1.38G 371M | Yes | 200M / 77GB / 8.6B | AraBERTv0.1-base| bert-base-arabertv01 | 543MB 136M | No | 77M / 23GB / 2.7B | AraBERTv1-base| bert-base-arabert | 543MB 136M | Yes | 77M / 23GB / 2.7B | All models are available in the model page under the aubmindlab name. Checkpoints are available in PyTorch, TF2 and TF1 formats. ## Better Pre-Processing and New Vocab We identified an issue with AraBERTv1's wordpiece vocabulary. The issue came from punctuations and numbers that were still attached to words when learned the wordpiece vocab. We now insert a space between numbers and characters and around punctuation characters. The new vocabulary was learnt using the from the library, and should now support the Fast tokenizer implementation from the library. **P.S.**: All the old BERT codes should work with the new BERT, just change the model name and check the new preprocessing dunction **Please read the section on how to use the preprocessing function** ## Bigger Dataset and More Compute We used ~3.5 times more data, and trained for longer. For Dataset Sources see the Dataset Section Model | Hardware | num of examples with seq len (128 / 512) |128 (Batch Size/ Num of Steps) | 512 (Batch Size/ Num of Steps) | Total Steps | Total Time (in Days) | ---|:---:|:---:|:---:|:---:|:---:|:---: AraBERTv0.2-base | TPUv3-8 | 420M / 207M |2560 / 1M | 384/ 2M | 3M | - AraBERTv0.2-large | TPUv3-128 | 420M / 207M | 13440 / 250K | 2056 / 300K | 550K | - AraBERTv2-base | TPUv3-8 | 520M / 245M |13440 / 250K | 2056 / 300K | 550K | - AraBERTv2-large | TPUv3-128 | 520M / 245M | 13440 / 250K | 2056 / 300K | 550K | - AraBERT-base (v1/v0.1) | TPUv2-8 | - |512 / 900K | 128 / 300K| 1.2M | 4 days # Dataset The pretraining data used for the new AraBERT model is also used for Arabic **GPT2 and ELECTRA**. 
The dataset consists of 77GB or 200,095,961 lines or 8,655,948,860 words or 82,232,988,358 chars (before applying Farasa Segmentation) For the new dataset we added the unshuffled OSCAR corpus, after we thoroughly filter it, to the previous dataset used in AraBERTv1 but with out the websites that we previously crawled: - OSCAR unshuffled and filtered. - Arabic Wikipedia dump from 2020/09/01 - The 1.5B words Arabic Corpus - The OSIAN Corpus - Assafir news articles. Huge thank you for Assafir for giving us the data # Preprocessing It is recommended to apply our preprocessing function before training/testing on any dataset. **Install farasapy to segment text for AraBERT v1 & v2 ** ## Accepted_models # TensorFlow 1.x models The TF1.x model are available in the HuggingFace models repo. You can download them as follows: - via git-lfs: clone all the models in a repo where is any model under the name - via : - Go to the tf1_model.tar.gz file on huggingface.co/models/aubmindlab/MODEL_NAME. - copy the - then run (ex: for : ) # If you used this model please cite us as : Google Scholar has our Bibtex wrong (missing name), use this instead # Acknowledgments Thanks to TensorFlow Research Cloud (TFRC) for the free access to Cloud TPUs, couldn't have done it without this program, and to the AUB MIND Lab Members for the continous support. Also thanks to Yakshof and Assafir for data and storage access. Another thanks for Habib Rahal ( for putting a face to AraBERT. ## Contacts **Wissam Antoun**: Linkedin | Twitter | Github | | **Fady Baly**: Linkedin | Twitter | Github | | ",
+ "model_explanation_gemini": "Pretrained BERT-based model for Arabic language understanding, optimized for tasks like sentiment analysis, named entity recognition, and question answering.\n\nFeatures: \n- Based on BERT-Base architecture \n- Trained on large Arabic datasets (Wikipedia, OSCAR, 1.5B-Arabic-Corpus, etc.) \n- Supports pre-segmented text (Farasa Segmenter in v1/v2) \n- Improved vocabulary handling (v2) \n- Available in base/large"
+}
\ No newline at end of file
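The AraBERT card's usage code is not captured in this entry. Below is a minimal fill-mask sketch that reuses the card's own (Farasa-segmented) widget text; the card additionally recommends applying its preprocessing function (with farasapy) before inference, a step omitted here for brevity.

```python
# Sketch: masked-token prediction with AraBERT v1, using the widget example from the card.
# Note: the card recommends preprocessing/segmenting raw text first; this sketch assumes
# the input is already in the segmented form shown in the widget.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="aubmindlab/bert-base-arabert")
print(fill_mask("عاصم +ة لبنان هي [MASK] .")[0]["sequence"])
```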
diff --git a/model_data_json/autogluon_chronos-bolt-tiny.json b/model_data_json/autogluon_chronos-bolt-tiny.json
new file mode 100644
index 0000000000000000000000000000000000000000..ef01190bd84009e8357280cc38f25fb4ab7ede61
--- /dev/null
+++ b/model_data_json/autogluon_chronos-bolt-tiny.json
@@ -0,0 +1,21 @@
+{
+ "model_id": "autogluon/chronos-bolt-tiny",
+ "downloads": 69867,
+ "tags": [
+ "safetensors",
+ "t5",
+ "time series",
+ "forecasting",
+ "pretrained models",
+ "foundation models",
+ "time series foundation models",
+ "time-series",
+ "time-series-forecasting",
+ "arxiv:1910.10683",
+ "arxiv:2403.07815",
+ "license:apache-2.0",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 pipeline_tag: time-series-forecasting tags: - time series - forecasting - pretrained models - foundation models - time series foundation models - time-series --- # Chronos-Bolt⚡ (Tiny) 🚀 **Update Feb 14, 2025**: Chronos-Bolt models are now available on Amazon SageMaker JumpStart! Check out the tutorial notebook to learn how to deploy Chronos endpoints for production use in a few lines of code. Chronos-Bolt is a family of pretrained time series forecasting models which can be used for zero-shot forecasting. It is based on the T5 encoder-decoder architecture and has been trained on nearly 100 billion time series observations. It chunks the historical time series context into patches of multiple observations, which are then input into the encoder. The decoder then uses these representations to directly generate quantile forecasts across multiple future steps—a method known as direct multi-step forecasting. Chronos-Bolt models are **more accurate**, up to **250 times faster** and **20 times more memory-efficient** than the original Chronos models of the same size. ## Performance The following plot compares the inference time of Chronos-Bolt against the original Chronos models for forecasting 1024 time series with a context length of 512 observations and a prediction horizon of 64 steps.
Chronos-Bolt models are not only significantly faster but also more accurate than the original Chronos models. The following plot reports the probabilistic and point forecasting performance of Chronos-Bolt in terms of the Weighted Quantile Loss (WQL) and the Mean Absolute Scaled Error (MASE), respectively, aggregated over 27 datasets (see the Chronos paper for details on this benchmark). Remarkably, despite having no prior exposure to these datasets during training, the zero-shot Chronos-Bolt models outperform commonly used statistical models and deep learning models that have been trained on these datasets (highlighted by *). Furthermore, they also perform better than other FMs, denoted by a +, which indicates that these models were pretrained on certain datasets in our benchmark and are not entirely zero-shot. Notably, Chronos-Bolt (Base) also surpasses the original Chronos (Large) model in terms of the forecasting accuracy while being over 600 times faster.
Chronos-Bolt models are available in the following sizes.
",
+ "model_explanation_gemini": "A German ELECTRA large discriminator model achieving state-of-the-art performance for German language tasks. \n\n**Features:** \n- Language: German \n- Architecture: ELECTRA large (discriminator) \n- Training data: Wikipedia, OPUS, OpenLegalData, OSCAR \n- License: MIT \n\n**Comparison:** \nOutperforms other German models like `deepset/gbert-base`, `deepset/gbert-large`, and smaller ELECTRA variants (`gelectra-base`), as"
+}
\ No newline at end of file
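The description explains direct multi-step quantile forecasting, but the usage snippet is not captured here. Below is a hedged sketch based on the open-source chronos-forecasting package; the class and method names (BaseChronosPipeline, predict_quantiles) are assumptions about that package rather than content from this card, and the input series is synthetic.

```python
# Hedged sketch: zero-shot quantile forecasting with chronos-forecasting (pip install chronos-forecasting).
# Class/method names are assumptions about that package; the toy series is illustrative.
import torch
from chronos import BaseChronosPipeline

pipeline = BaseChronosPipeline.from_pretrained("autogluon/chronos-bolt-tiny", device_map="cpu")

context = torch.sin(torch.arange(100, dtype=torch.float32) * 0.3)  # toy univariate series
quantiles, mean = pipeline.predict_quantiles(
    context=context, prediction_length=12, quantile_levels=[0.1, 0.5, 0.9]
)
print(mean.shape, quantiles.shape)
```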
diff --git a/model_data_json/dima806_man_woman_face_image_detection.json b/model_data_json/dima806_man_woman_face_image_detection.json
new file mode 100644
index 0000000000000000000000000000000000000000..e4b5e5eddd1636c084cbc55565613e6e9cfe2a22
--- /dev/null
+++ b/model_data_json/dima806_man_woman_face_image_detection.json
@@ -0,0 +1,19 @@
+{
+ "model_id": "dima806/man_woman_face_image_detection",
+ "downloads": 74322,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "safetensors",
+ "vit",
+ "image-classification",
+ "base_model:google/vit-base-patch16-224-in21k",
+ "base_model:finetune:google/vit-base-patch16-224-in21k",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 metrics: - accuracy - f1 base_model: - google/vit-base-patch16-224-in21k --- Returns with about 98.7% accuracy whether the face belongs to man or woman based on face image. See for more details. !image/png",
+ "model_explanation_gemini": "Detects whether a face in an image belongs to a man or woman with 98.7% accuracy. \n\n**Features:** \n- High accuracy (98.7%) \n- Uses Google's ViT-base-patch16-224-in21k as base model \n- Evaluated on accuracy and F1 metrics \n\n**Comparison:** (Not explicitly provided in the description, so omitted.)"
+}
\ No newline at end of file
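A short image-classification sketch may clarify how this ViT-based classifier is typically invoked; the image path is a hypothetical placeholder.

```python
# Sketch: classifying a face crop with the fine-tuned ViT gender classifier.
from transformers import pipeline

classifier = pipeline("image-classification", model="dima806/man_woman_face_image_detection")
print(classifier("face.jpg"))  # hypothetical local image; returns [{'label': ..., 'score': ...}, ...]
```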
diff --git a/model_data_json/ds4sd_SmolDocling-256M-preview.json b/model_data_json/ds4sd_SmolDocling-256M-preview.json
new file mode 100644
index 0000000000000000000000000000000000000000..c9c29b6c317838d282a35775fcde48e30be77419
--- /dev/null
+++ b/model_data_json/ds4sd_SmolDocling-256M-preview.json
@@ -0,0 +1,22 @@
+{
+ "model_id": "ds4sd/SmolDocling-256M-preview",
+ "downloads": 80269,
+ "tags": [
+ "transformers",
+ "onnx",
+ "safetensors",
+ "idefics3",
+ "image-text-to-text",
+ "conversational",
+ "en",
+ "arxiv:2503.11576",
+ "arxiv:2305.03393",
+ "base_model:HuggingFaceTB/SmolVLM-256M-Instruct",
+ "base_model:quantized:HuggingFaceTB/SmolVLM-256M-Instruct",
+ "license:cdla-permissive-2.0",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- base_model: - HuggingFaceTB/SmolVLM-256M-Instruct language: - en library_name: transformers license: cdla-permissive-2.0 pipeline_tag: image-text-to-text ---
SmolDocling-256M-preview
SmolDocling is a multimodal Image-Text-to-Text model designed for efficient document conversion. It retains Docling's most popular features while ensuring full compatibility with Docling through seamless support for DoclingDocuments.
This model was presented in the paper SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion. ### 🚀 Features: - 🏷️ **DocTags for Efficient Tokenization** – Introduces DocTags an efficient and minimal representation for documents that is fully compatible with **DoclingDocuments**. - 🔍 **OCR (Optical Character Recognition)** – Extracts text accurately from images. - 📐 **Layout and Localization** – Preserves document structure and document element **bounding boxes**. - 💻 **Code Recognition** – Detects and formats code blocks including identation. - 🔢 **Formula Recognition** – Identifies and processes mathematical expressions. - 📊 **Chart Recognition** – Extracts and interprets chart data. - 📑 **Table Recognition** – Supports column and row headers for structured table extraction. - 🖼️ **Figure Classification** – Differentiates figures and graphical elements. - 📝 **Caption Correspondence** – Links captions to relevant images and figures. - 📜 **List Grouping** – Organizes and structures list elements correctly. - 📄 **Full-Page Conversion** – Processes entire pages for comprehensive document conversion including all page elements (code, equations, tables, charts etc.) - 🔲 **OCR with Bounding Boxes** – OCR regions using a bounding box. - 📂 **General Document Processing** – Trained for both scientific and non-scientific documents. - 🔄 **Seamless Docling Integration** – Import into **Docling** and export in multiple formats. - 💨 **Fast inference using VLLM** – Avg of 0.35 secs per page on A100 GPU. ### 🚧 *Coming soon!* - 📊 **Better chart recognition 🛠️** - 📚 **One shot multi-page inference ⏱️** - 🧪 **Chemical Recognition** - 📙 **Datasets** ## ⌨️ Get started (code examples) You can use **transformers**, **vllm**, or **onnx** to perform inference, and Docling to convert results to variety of output formats (md, html, etc.): 📄 Single page image inference using Tranformers 🤖 🚀 Fast Batch Inference Using VLLM ONNX Inference 💻 Local inference on Apple Silicon with MLX: see here ## DocTags DocTags create a clear and structured system of tags and rules that separate text from the document's structure. This makes things easier for Image-to-Sequence models by reducing confusion. On the other hand, converting directly to formats like HTML or Markdown can be messy—it often loses details, doesn’t clearly show the document’s layout, and increases the number of tokens, making processing less efficient. DocTags are integrated with Docling, which allows export to HTML, Markdown, and JSON. These exports can be offloaded to the CPU, reducing token generation overhead and improving efficiency. ## Supported Instructions
| Description | Instruction | Comment |
|---|---|---|
| Full conversion | Convert this page to docling. | DocTags representation |
| Chart | Convert chart to table. | (e.g., <chart>) |
| Formula | Convert formula to LaTeX. | (e.g., <formula>) |
| Code | Convert code to text. | (e.g., <code>) |
| Table | Convert table to OTSL. | (e.g., <otsl>) OTSL: |
| Actions and Pipelines | OCR the text in a specific location: <loc_155><loc_233><loc_206><loc_237> | |
| | Identify element at: <loc_247><loc_482><loc_252><loc_486> | |
| | Find all 'text' elements on the page, retrieve all section headers. | |
| | Detect footer elements on the page. | |
#### Model Summary - **Developed by:** Docling Team, IBM Research - **Model type:** Multi-modal model (image+text) - **Language(s) (NLP):** English - **License:** Apache 2.0 - **Architecture:** Based on Idefics3 (see technical summary) - **Finetuned from model:** Based on SmolVLM-256M-Instruct **Repository:** Docling **Paper:** arXiv **Project Page:** Hugging Face **Citation:** **Demo:** HF Space",
+ "model_explanation_gemini": "A multimodal image-text-to-text model designed for efficient document conversion, supporting OCR, layout preservation, code/formula/chart/table recognition, and seamless integration with Docling.\n\n**Model Features**: \n- DocTags for efficient tokenization \n- OCR with bounding boxes \n- Layout and element localization \n- Code, formula, chart, and table recognition \n- Figure classification and caption correspondence \n- List grouping and full-page conversion \n- Fast inference (0.35s/page on A100)"
+}
\ No newline at end of file
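The SmolDocling card references transformers, vllm, and onnx inference paths, but the snippets are not captured here. Below is a hedged single-page sketch that follows the SmolVLM/Idefics3 chat-template pattern in Transformers; the page image is a placeholder and generation settings are illustrative.

```python
# Hedged sketch: single-page DocTags generation with Transformers, assuming the
# SmolVLM/Idefics3 chat-template pattern; the image path is a placeholder.
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

processor = AutoProcessor.from_pretrained("ds4sd/SmolDocling-256M-preview")
model = AutoModelForVision2Seq.from_pretrained("ds4sd/SmolDocling-256M-preview")

image = Image.open("page.png")  # hypothetical scanned document page
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Convert this page to docling."},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")

generated = model.generate(**inputs, max_new_tokens=1024)
print(processor.batch_decode(generated, skip_special_tokens=True)[0])  # DocTags output
```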
diff --git a/model_data_json/dslim_distilbert-NER.json b/model_data_json/dslim_distilbert-NER.json
new file mode 100644
index 0000000000000000000000000000000000000000..8e1a7664424bbef74426f71c46b9eeb3a41b333d
--- /dev/null
+++ b/model_data_json/dslim_distilbert-NER.json
@@ -0,0 +1,24 @@
+{
+ "model_id": "dslim/distilbert-NER",
+ "downloads": 70773,
+ "tags": [
+ "transformers",
+ "tensorboard",
+ "onnx",
+ "safetensors",
+ "distilbert",
+ "token-classification",
+ "en",
+ "dataset:conll2003",
+ "arxiv:1810.04805",
+ "arxiv:1910.01108",
+ "base_model:distilbert/distilbert-base-cased",
+ "base_model:quantized:distilbert/distilbert-base-cased",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 base_model: distilbert-base-cased metrics: - precision - recall - f1 - accuracy model-index: - name: distilbert-NER results: [] datasets: - conll2003 language: - en pipeline_tag: token-classification --- # distilbert-NER If my open source models have been useful to you, please consider supporting me in building small, useful AI models for everyone (and help me afford med school / help out my parents financially). Thanks! ## Model description **distilbert-NER** is the fine-tuned version of **DistilBERT**, which is a distilled variant of the BERT model. DistilBERT has fewer parameters than BERT, making it smaller, faster, and more efficient. distilbert-NER is specifically fine-tuned for the task of **Named Entity Recognition (NER)**. This model accurately identifies the same four types of entities as its BERT counterparts: location (LOC), organizations (ORG), person (PER), and Miscellaneous (MISC). Although it is a more compact model, distilbert-NER demonstrates a robust performance in NER tasks, balancing between size, speed, and accuracy. The model was fine-tuned on the English version of the CoNLL-2003 Named Entity Recognition dataset, which is widely recognized for its comprehensive and diverse range of entity types. ### Available NER models | Model Name | Description | Parameters | |-------------------|-------------|------------------| | distilbert-NER | Fine-tuned DistilBERT - a smaller, faster, lighter version of BERT | 66M | | bert-large-NER | Fine-tuned bert-large-cased - larger model with slightly better performance | 340M | | bert-base-NER-(uncased) | Fine-tuned bert-base, available in both cased and uncased versions | 110M | ## Intended uses & limitations #### How to use This model can be utilized with the Transformers *pipeline* for NER, similar to the BERT models. #### Limitations and bias The performance of distilbert-NER is linked to its training on the CoNLL-2003 dataset. Therefore, it might show limited effectiveness on text data that significantly differs from this training set. Users should be aware of potential biases inherent in the training data and the possibility of entity misclassification in complex sentences. ## Training data This model was fine-tuned on English version of the standard CoNLL-2003 Named Entity Recognition dataset. The training dataset distinguishes between the beginning and continuation of an entity so that if there are back-to-back entities of the same type, the model can output where the second entity begins. As in the dataset, each token will be classified as one of the following classes: Abbreviation|Description -|- O|Outside of a named entity B-MISC |Beginning of a miscellaneous entity right after another miscellaneous entity I-MISC | Miscellaneous entity B-PER |Beginning of a person’s name right after another person’s name I-PER |Person’s name B-ORG |Beginning of an organization right after another organization I-ORG |organization B-LOC |Beginning of a location right after another location I-LOC |Location ### CoNLL-2003 English Dataset Statistics This dataset was derived from the Reuters corpus which consists of Reuters news stories. You can read more about how this dataset was created in the CoNLL-2003 paper. 
#### # of training examples per entity type

Dataset|LOC|MISC|ORG|PER
-|-|-|-|-
Train|7140|3438|6321|6600
Dev|1837|922|1341|1842
Test|1668|702|1661|1617

#### # of articles/sentences/tokens per dataset

Dataset |Articles |Sentences |Tokens
-|-|-|-
Train |946 |14,987 |203,621
Dev |216 |3,466 |51,362
Test |231 |3,684 |46,435

## Training procedure This model was trained on a single NVIDIA V100 GPU with recommended hyperparameters from the original BERT paper which trained & evaluated the model on CoNLL-2003 NER task. ## Eval results

| Metric | Score |
|------------|-------|
| Loss | 0.0710|
| Precision | 0.9202|
| Recall | 0.9232|
| F1 | 0.9217|
| Accuracy | 0.9810|

The training and validation losses demonstrate a decrease over epochs, signaling effective learning. The precision, recall, and F1 scores are competitive, showcasing the model's robustness in NER tasks. ### BibTeX entry and citation info For DistilBERT: For the underlying BERT model:",
+ "model_explanation_gemini": "\"Distilbert-NER is a compact, efficient named entity recognition model fine-tuned from DistilBERT to identify four entity types (LOC, ORG, PER, MISC) with robust performance, trained on the CoNLL-2003 dataset.\"\n\nFeatures: \n- Fine-tuned for Named Entity Recognition (NER) \n- Identifies entities: location (LOC), organizations (ORG), person (PER), Miscellaneous (MISC) \n- Based on DistilBERT (smaller"
+}
\ No newline at end of file
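The card notes that the model "can be utilized with the Transformers pipeline for NER" without showing the snippet. A minimal sketch of that usage follows; the example sentence is illustrative.

```python
# Sketch: named entity recognition with distilbert-NER via the Transformers pipeline.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="dslim/distilbert-NER",
    aggregation_strategy="simple",  # group sub-tokens into whole entities
)
print(ner("My name is Wolfgang and I live in Berlin."))
# -> entities tagged as PER / LOC / ORG / MISC with confidence scores
```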
diff --git a/model_data_json/embaas_sentence-transformers-e5-large-v2.json b/model_data_json/embaas_sentence-transformers-e5-large-v2.json
new file mode 100644
index 0000000000000000000000000000000000000000..bd96fa3b63d439e2d0f38960ea8fede821b05ba3
--- /dev/null
+++ b/model_data_json/embaas_sentence-transformers-e5-large-v2.json
@@ -0,0 +1,17 @@
+{
+ "model_id": "embaas/sentence-transformers-e5-large-v2",
+ "downloads": 73837,
+ "tags": [
+ "sentence-transformers",
+ "pytorch",
+ "bert",
+ "feature-extraction",
+ "sentence-similarity",
+ "autotrain_compatible",
+ "text-embeddings-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- pipeline_tag: sentence-similarity tags: - sentence-transformers - feature-extraction - sentence-similarity --- # embaas/sentence-transformers-e5-large-v2 This is a the sentence-transformers version of the intfloat/e5-large-v2 model: It maps sentences & paragraphs to a 1024 dimensional dense vector space and can be used for tasks like clustering or semantic search. ## Usage (Sentence-Transformers) Using this model becomes easy when you have sentence-transformers installed: Then you can use the model like this: ## Using with API You can use the embaas API to encode your input. Get your free API key from embaas.io ## Evaluation Results Find the results of the e5 at the MTEB leaderboard ## Full Model Architecture ## Citing & Authors ",
+ "model_explanation_gemini": "Maps sentences and paragraphs to a 1024-dimensional vector space for tasks like clustering and semantic search. \n\n**Features:** \n- Sentence/paragraph embedding into 1024-dimensional vectors \n- Supports clustering and semantic search \n- Compatible with Sentence-Transformers and embaas API \n\n**Comparison:** \nThis model is the Sentence-Transformers adaptation of **intfloat/e5-large-v2**, offering similar dense vector representations but optimized for seamless integration with the Sentence-Transformers library and"
+}
\ No newline at end of file
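The usage section above ("Then you can use the model like this:") had its code stripped when the card was flattened into JSON. A minimal sentence-transformers sketch; the "query: " / "passage: " prefixes follow the upstream intfloat/e5-large-v2 convention and are an assumption here:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("embaas/sentence-transformers-e5-large-v2")

# e5-style checkpoints are usually queried with "query: " / "passage: " prefixes
sentences = [
    "query: how much protein should a female eat",
    "passage: As a general guideline, adult women need about 46 g of protein per day.",
]
embeddings = model.encode(sentences, normalize_embeddings=True)
print(embeddings.shape)  # (2, 1024)
```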
diff --git a/model_data_json/evo-design_evo-1.5-8k-base.json b/model_data_json/evo-design_evo-1.5-8k-base.json
new file mode 100644
index 0000000000000000000000000000000000000000..34102f9b84cb4e9f7e6491f81e5180fd379ec10b
--- /dev/null
+++ b/model_data_json/evo-design_evo-1.5-8k-base.json
@@ -0,0 +1,19 @@
+{
+ "model_id": "evo-design/evo-1.5-8k-base",
+ "downloads": 74038,
+ "tags": [
+ "pytorch",
+ "safetensors",
+ "stripedhyena",
+ "long context",
+ "deep signal processing",
+ "hybrid",
+ "biology",
+ "genomics",
+ "custom_code",
+ "license:apache-2.0",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 tags: - stripedhyena - long context - deep signal processing - hybrid - biology - genomics --- ## Evo 1.5
### About Evo is a biological foundation model capable of long-context modeling and design. Evo uses the StripedHyena architecture to enable modeling of sequences at a single-nucleotide, byte-level resolution with near-linear scaling of compute and memory relative to context length. Evo has 7 billion parameters and is trained on OpenGenome, a prokaryotic whole-genome dataset containing ~300 billion tokens. **Evo 1.5** is a version of Evo built off of the Evo 1 model pretrained at 8k context with training extended by 50% more training data, totaling 450 billion tokens. | Checkpoint Name | Description | |----------------------------------------|-------------| | | A model pretrained with 8,192 context obtained by extending the pretraining of to process 50% more training data. | | | A model pretrained with 8,192 context. We use this model as the base model for molecular-scale finetuning tasks. | | | A model pretrained with 131,072 context using as the initialization. We use this model to reason about and generate sequences at the genome scale. | | | A model fine-tuned on specifically on CRISPR-Cas systems. We use this model to generate Cas9/12/13 systems. | | | A model fine-tuned on specifically on transposons. We use this to generate IS200/IS605. | ### How to use Evo Example usage is provided in the standalone repo. ## Cite",
+ "model_explanation_gemini": "A biological foundation model for long-context genome sequence modeling and design, using StripedHyena architecture for efficient single-nucleotide resolution processing.\n\n**Model Features:** \n- 7B parameters \n- StripedHyena architecture for near-linear scaling \n- Trained on 450B tokens (OpenGenome prokaryotic dataset) \n- 8k context length \n- Single-nucleotide/byte-level resolution \n- Specialized versions for CRISPR-Cas systems and transposons \n\n**Comparison:**"
+}
\ No newline at end of file
diff --git a/model_data_json/facebook_convnextv2-base-22k-384.json b/model_data_json/facebook_convnextv2-base-22k-384.json
new file mode 100644
index 0000000000000000000000000000000000000000..21bf7070926128419933bbfa612bdc25ab1c4b24
--- /dev/null
+++ b/model_data_json/facebook_convnextv2-base-22k-384.json
@@ -0,0 +1,20 @@
+{
+ "model_id": "facebook/convnextv2-base-22k-384",
+ "downloads": 80286,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "safetensors",
+ "convnextv2",
+ "image-classification",
+ "vision",
+ "dataset:imagenet-22k",
+ "arxiv:2301.00808",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 tags: - vision - image-classification datasets: - imagenet-22k widget: - src: example_title: Tiger - src: example_title: Teapot - src: example_title: Palace --- # ConvNeXt V2 (base-sized model) ConvNeXt V2 model pretrained using the FCMAE framework and fine-tuned on the ImageNet-22K dataset at resolution 384x384. It was introduced in the paper ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders by Woo et al. and first released in this repository. Disclaimer: The team releasing ConvNeXT V2 did not write a model card for this model so this model card has been written by the Hugging Face team. ## Model description ConvNeXt V2 is a pure convolutional model (ConvNet) that introduces a fully convolutional masked autoencoder framework (FCMAE) and a new Global Response Normalization (GRN) layer to ConvNeXt. ConvNeXt V2 significantly improves the performance of pure ConvNets on various recognition benchmarks. !model image ## Intended uses & limitations You can use the raw model for image classification. See the model hub to look for fine-tuned versions on a task that interests you. ### How to use Here is how to use this model to classify an image of the COCO 2017 dataset into one of the 1,000 ImageNet classes: For more code examples, we refer to the documentation. ### BibTeX entry and citation info",
+ "model_explanation_gemini": "\"facebook_convnextv2-base-22k-384 is a convolutional neural network (ConvNet) pretrained with FCMAE and fine-tuned on ImageNet-22K for image classification at 384x384 resolution, featuring Global Response Normalization for improved performance.\"\n\nFeatures: \n- Architecture: ConvNeXt V2 (base-sized) \n- Pretraining: FCMAE (fully convolutional masked autoencoder framework) \n- Fine-tuning: ImageNet-22K dataset"
+}
\ No newline at end of file
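The "How to use" paragraph above references a classification snippet that is not reproduced in this JSON. A minimal sketch, assuming `torch`, `transformers`, `Pillow`, and `requests` are installed; the COCO image URL is the one commonly used in Hugging Face examples, not something stated in the card:

```python
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, ConvNextV2ForImageClassification

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # example COCO 2017 image
image = Image.open(requests.get(url, stream=True).raw)

processor = AutoImageProcessor.from_pretrained("facebook/convnextv2-base-22k-384")
model = ConvNextV2ForImageClassification.from_pretrained("facebook/convnextv2-base-22k-384")

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])  # predicted ImageNet-22k label
```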
diff --git a/model_data_json/facebook_esm2_t30_150M_UR50D.json b/model_data_json/facebook_esm2_t30_150M_UR50D.json
new file mode 100644
index 0000000000000000000000000000000000000000..857c2169d664fbfed1c532f98a77855264df3a97
--- /dev/null
+++ b/model_data_json/facebook_esm2_t30_150M_UR50D.json
@@ -0,0 +1,18 @@
+{
+ "model_id": "facebook/esm2_t30_150M_UR50D",
+ "downloads": 73482,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "tf",
+ "safetensors",
+ "esm",
+ "fill-mask",
+ "license:mit",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: mit widget: - text: \"MQIFVKTLTGKTITLEVEPSTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG\" --- ## ESM-2 ESM-2 is a state-of-the-art protein model trained on a masked language modelling objective. It is suitable for fine-tuning on a wide range of tasks that take protein sequences as input. For detailed information on the model architecture and training data, please refer to the accompanying paper. You may also be interested in some demo notebooks (PyTorch, TensorFlow) which demonstrate how to fine-tune ESM-2 models on your tasks of interest. Several ESM-2 checkpoints are available in the Hub with varying sizes. Larger sizes generally have somewhat better accuracy, but require much more memory and time to train: | Checkpoint name | Num layers | Num parameters | |------------------------------|----|----------| | esm2_t48_15B_UR50D | 48 | 15B | | esm2_t36_3B_UR50D | 36 | 3B | | esm2_t33_650M_UR50D | 33 | 650M | | esm2_t30_150M_UR50D | 30 | 150M | | esm2_t12_35M_UR50D | 12 | 35M | | esm2_t6_8M_UR50D | 6 | 8M |",
+ "model_explanation_gemini": "\"ESM-2 is a protein language model trained for masked language modeling, suitable for fine-tuning on protein sequence tasks.\"\n\nModel Features: \n- Masked language modeling objective \n- Trained on protein sequences \n- Supports fine-tuning for various protein-related tasks \n- Available in multiple sizes (8M to 15B parameters) \n\nComparison: \nLarger ESM-2 checkpoints (e.g., esm2_t48_15B_UR50D) offer better accuracy but"
+}
\ No newline at end of file
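ESM-2 is tagged as a fill-mask model, and the card's widget shows a ubiquitin sequence. A minimal masked-residue prediction sketch; masking position 20 is an arbitrary choice made for illustration:

```python
from transformers import pipeline

unmasker = pipeline("fill-mask", model="facebook/esm2_t30_150M_UR50D")

# Ubiquitin sequence from the card's widget, with one residue replaced by the mask token
sequence = "MQIFVKTLTGKTITLEVEPSTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
masked = sequence[:20] + unmasker.tokenizer.mask_token + sequence[21:]

for prediction in unmasker(masked, top_k=3):
    print(prediction["token_str"], round(prediction["score"], 3))
```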
diff --git a/model_data_json/facebook_opt-13b.json b/model_data_json/facebook_opt-13b.json
new file mode 100644
index 0000000000000000000000000000000000000000..7eb662c422aa837d04c562600430e6df83d35dfc
--- /dev/null
+++ b/model_data_json/facebook_opt-13b.json
@@ -0,0 +1,21 @@
+{
+ "model_id": "facebook/opt-13b",
+ "downloads": 76208,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "tf",
+ "jax",
+ "opt",
+ "text-generation",
+ "en",
+ "arxiv:2205.01068",
+ "arxiv:2005.14165",
+ "license:other",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "region:us"
+ ],
+ "description": "--- language: en inference: false tags: - opt - text-generation license: other commercial: false --- # OPT : Open Pre-trained Transformer Language Models OPT was first introduced in Open Pre-trained Transformer Language Models and first released in metaseq's repository on May 3rd 2022 by Meta AI. **Disclaimer**: The team releasing OPT wrote an official model card, which is available in Appendix D of the paper. Content from **this** model card has been written by the Hugging Face team. ## Intro To quote the first two paragraphs of the official paper > Large language models trained on massive text collections have shown surprising emergent > capabilities to generate text and perform zero- and few-shot learning. While in some cases the public > can interact with these models through paid APIs, full model access is currently limited to only a > few highly resourced labs. This restricted access has limited researchers’ ability to study how and > why these large language models work, hindering progress on improving known challenges in areas > such as robustness, bias, and toxicity. > We present Open Pretrained Transformers (OPT), a suite of decoder-only pre-trained transformers ranging from 125M > to 175B parameters, which we aim to fully and responsibly share with interested researchers. We train the OPT models to roughly match > the performance and sizes of the GPT-3 class of models, while also applying the latest best practices in data > collection and efficient training. Our aim in developing this suite of OPT models is to enable reproducible and responsible research at scale, and > to bring more voices to the table in studying the impact of these LLMs. Definitions of risk, harm, bias, and toxicity, etc., should be articulated by the > collective research community as a whole, which is only possible when models are available for study. ## Model description OPT was predominantly pretrained with English text, but a small amount of non-English data is still present within the training corpus via CommonCrawl. The model was pretrained using a causal language modeling (CLM) objective. OPT belongs to the same family of decoder-only models like GPT-3. As such, it was pretrained using the self-supervised causal language modedling objective. For evaluation, OPT follows GPT-3 by using their prompts and overall experimental setup. For more details, please read the official paper. ## Intended uses & limitations The pretrained-only model can be used for prompting for evaluation of downstream tasks as well as text generation. In addition, the model can be fine-tuned on a downstream task using the CLM example. For all other OPT checkpoints, please have a look at the model hub. ### How to use For large OPT models, such as this one, it is not recommend to make use of the pipeline because one should load the model in half-precision to accelerate generation and optimize memory consumption on GPU. It is recommended to directly call the []( method as follows: By default, generation is deterministic. In order to use the top-k sampling, please set to . ### Limitations and bias As mentioned in Meta AI's model card, given that the training data used for this model contains a lot of unfiltered content from the internet, which is far from neutral the model is strongly biased : > Like other large language models for which the diversity (or lack thereof) of training > data induces downstream impact on the quality of our model, OPT-175B has limitations in terms > of bias and safety. 
OPT-175B can also have quality issues in terms of generation diversity and > hallucination. In general, OPT-175B is not immune from the plethora of issues that plague modern > large language models. Here's an example of how the model can have biased predictions: compared to: This bias will also affect all fine-tuned versions of this model. ## Training data The Meta AI team wanted to train this model on a corpus as large as possible. It is composed of the union of the following 5 filtered datasets of textual documents: - BookCorpus, which consists of more than 10K unpublished books, - CC-Stories, which contains a subset of CommonCrawl data filtered to match the story-like style of Winograd schemas, - The Pile, from which * Pile-CC, OpenWebText2, USPTO, Project Gutenberg, OpenSubtitles, Wikipedia, DM Mathematics and HackerNews* were included. - Pushshift.io Reddit dataset that was developed in Baumgartner et al. (2020) and processed in Roller et al. (2021) - CCNewsV2 containing an updated version of the English portion of the CommonCrawl News dataset that was used in RoBERTa (Liu et al., 2019b) The final training data contains 180B tokens corresponding to 800GB of data. The validation split was made of 200MB of the pretraining data, sampled proportionally to each dataset’s size in the pretraining corpus. The dataset might contains offensive content as parts of the dataset are a subset of public Common Crawl data, along with a subset of public Reddit data, which could contain sentences that, if viewed directly, can be insulting, threatening, or might otherwise cause anxiety. ### Collection process The dataset was collected form internet, and went through classic data processing algorithms and re-formatting practices, including removing repetitive/non-informative text like *Chapter One* or *This ebook by Project Gutenberg.* ## Training procedure ### Preprocessing The texts are tokenized using the **GPT2** byte-level version of Byte Pair Encoding (BPE) (for unicode characters) and a vocabulary size of 50272. The inputs are sequences of 2048 consecutive tokens. The 175B model was trained on 992 *80GB A100 GPUs*. The training duration was roughly ~33 days of continuous training. ### BibTeX entry and citation info",
+ "model_explanation_gemini": "A decoder-only English language model pretrained for text generation and few-shot learning, designed to match GPT-3's performance while enabling open research access.\n\n**Features:** \n- Decoder-only transformer architecture \n- Pretrained with causal language modeling (CLM) \n- Primarily English text with some non-English data \n- Supports prompting and fine-tuning for downstream tasks \n- Biases inherited from unfiltered internet training data \n\n**Comparison:** \nOPT-13b mirrors GPT-3's capabilities"
+}
\ No newline at end of file
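The card recommends skipping the pipeline for large OPT checkpoints, loading the model in half precision, and calling generate directly; the actual snippet and the sampling flag it refers to are elided above. A minimal sketch, assuming `torch` and the `accelerate` package (needed for `device_map="auto"`) are available:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Half-precision load; device_map="auto" requires the accelerate package
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-13b", torch_dtype=torch.float16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-13b")

prompt = "Hello, I am conscious and"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

# Generation is deterministic by default; do_sample=True enables top-k sampling
output_ids = model.generate(input_ids, max_new_tokens=30, do_sample=True, top_k=50)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0])
```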
diff --git a/model_data_json/facebook_opt-6.7b.json b/model_data_json/facebook_opt-6.7b.json
new file mode 100644
index 0000000000000000000000000000000000000000..cdb516ffa5d1ab5dcc08ca8cb962c5c6196c4328
--- /dev/null
+++ b/model_data_json/facebook_opt-6.7b.json
@@ -0,0 +1,21 @@
+{
+ "model_id": "facebook/opt-6.7b",
+ "downloads": 69673,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "tf",
+ "jax",
+ "opt",
+ "text-generation",
+ "en",
+ "arxiv:2205.01068",
+ "arxiv:2005.14165",
+ "license:other",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "region:us"
+ ],
+ "description": "--- language: en inference: false tags: - text-generation - opt license: other commercial: false --- # OPT : Open Pre-trained Transformer Language Models OPT was first introduced in Open Pre-trained Transformer Language Models and first released in metaseq's repository on May 3rd 2022 by Meta AI. **Disclaimer**: The team releasing OPT wrote an official model card, which is available in Appendix D of the paper. Content from **this** model card has been written by the Hugging Face team. ## Intro To quote the first two paragraphs of the official paper > Large language models trained on massive text collections have shown surprising emergent > capabilities to generate text and perform zero- and few-shot learning. While in some cases the public > can interact with these models through paid APIs, full model access is currently limited to only a > few highly resourced labs. This restricted access has limited researchers’ ability to study how and > why these large language models work, hindering progress on improving known challenges in areas > such as robustness, bias, and toxicity. > We present Open Pretrained Transformers (OPT), a suite of decoder-only pre-trained transformers ranging from 125M > to 175B parameters, which we aim to fully and responsibly share with interested researchers. We train the OPT models to roughly match > the performance and sizes of the GPT-3 class of models, while also applying the latest best practices in data > collection and efficient training. Our aim in developing this suite of OPT models is to enable reproducible and responsible research at scale, and > to bring more voices to the table in studying the impact of these LLMs. Definitions of risk, harm, bias, and toxicity, etc., should be articulated by the > collective research community as a whole, which is only possible when models are available for study. ## Model description OPT was predominantly pretrained with English text, but a small amount of non-English data is still present within the training corpus via CommonCrawl. The model was pretrained using a causal language modeling (CLM) objective. OPT belongs to the same family of decoder-only models like GPT-3. As such, it was pretrained using the self-supervised causal language modedling objective. For evaluation, OPT follows GPT-3 by using their prompts and overall experimental setup. For more details, please read the official paper. ## Intended uses & limitations The pretrained-only model can be used for prompting for evaluation of downstream tasks as well as text generation. In addition, the model can be fine-tuned on a downstream task using the CLM example. For all other OPT checkpoints, please have a look at the model hub. ### How to use For large OPT models, such as this one, it is not recommend to make use of the pipeline because one should load the model in half-precision to accelerate generation and optimize memory consumption on GPU. It is recommended to directly call the []( method as follows: By default, generation is deterministic. In order to use the top-k sampling, please set to . ### Limitations and bias As mentioned in Meta AI's model card, given that the training data used for this model contains a lot of unfiltered content from the internet, which is far from neutral the model is strongly biased : > Like other large language models for which the diversity (or lack thereof) of training > data induces downstream impact on the quality of our model, OPT-175B has limitations in terms > of bias and safety. 
OPT-175B can also have quality issues in terms of generation diversity and > hallucination. In general, OPT-175B is not immune from the plethora of issues that plague modern > large language models. Here's an example of how the model can have biased predictions: compared to: This bias will also affect all fine-tuned versions of this model. ## Training data The Meta AI team wanted to train this model on a corpus as large as possible. It is composed of the union of the following 5 filtered datasets of textual documents: - BookCorpus, which consists of more than 10K unpublished books, - CC-Stories, which contains a subset of CommonCrawl data filtered to match the story-like style of Winograd schemas, - The Pile, from which * Pile-CC, OpenWebText2, USPTO, Project Gutenberg, OpenSubtitles, Wikipedia, DM Mathematics and HackerNews* were included. - Pushshift.io Reddit dataset that was developed in Baumgartner et al. (2020) and processed in Roller et al. (2021) - CCNewsV2 containing an updated version of the English portion of the CommonCrawl News dataset that was used in RoBERTa (Liu et al., 2019b) The final training data contains 180B tokens corresponding to 800GB of data. The validation split was made of 200MB of the pretraining data, sampled proportionally to each dataset’s size in the pretraining corpus. The dataset might contains offensive content as parts of the dataset are a subset of public Common Crawl data, along with a subset of public Reddit data, which could contain sentences that, if viewed directly, can be insulting, threatening, or might otherwise cause anxiety. ### Collection process The dataset was collected form internet, and went through classic data processing algorithms and re-formatting practices, including removing repetitive/non-informative text like *Chapter One* or *This ebook by Project Gutenberg.* ## Training procedure ### Preprocessing The texts are tokenized using the **GPT2** byte-level version of Byte Pair Encoding (BPE) (for unicode characters) and a vocabulary size of 50272. The inputs are sequences of 2048 consecutive tokens. The 175B model was trained on 992 *80GB A100 GPUs*. The training duration was roughly ~33 days of continuous training. ### BibTeX entry and citation info",
+ "model_explanation_gemini": "A decoder-only English language model (OPT-6.7B) designed for text generation and zero/few-shot learning, replicating GPT-3's performance while promoting open research access. \n\n**Features:** \n- **Architecture:** Decoder-only transformer (similar to GPT-3) \n- **Training Objective:** Causal language modeling (CLM) \n- **Primary Use:** Text generation, prompting for downstream tasks, and fine-tuning \n- **Data:** Predominantly English with"
+}
\ No newline at end of file
diff --git a/model_data_json/facebook_vit-mae-base.json b/model_data_json/facebook_vit-mae-base.json
new file mode 100644
index 0000000000000000000000000000000000000000..c94ae1a4fa64266f2f44452f78baf3bd60a6c9aa
--- /dev/null
+++ b/model_data_json/facebook_vit-mae-base.json
@@ -0,0 +1,20 @@
+{
+ "model_id": "facebook/vit-mae-base",
+ "downloads": 72697,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "tf",
+ "safetensors",
+ "vit_mae",
+ "pretraining",
+ "vision",
+ "dataset:imagenet-1k",
+ "arxiv:2111.06377",
+ "license:apache-2.0",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 tags: - vision datasets: - imagenet-1k --- # Vision Transformer (base-sized model) pre-trained with MAE Vision Transformer (ViT) model pre-trained using the MAE method. It was introduced in the paper Masked Autoencoders Are Scalable Vision Learners by Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross Girshick and first released in this repository. Disclaimer: The team releasing MAE did not write a model card for this model so this model card has been written by the Hugging Face team. ## Model description The Vision Transformer (ViT) is a transformer encoder model (BERT-like). Images are presented to the model as a sequence of fixed-size patches. During pre-training, one randomly masks out a high portion (75%) of the image patches. First, the encoder is used to encode the visual patches. Next, a learnable (shared) mask token is added at the positions of the masked patches. The decoder takes the encoded visual patches and mask tokens as input and reconstructs raw pixel values for the masked positions. By pre-training the model, it learns an inner representation of images that can then be used to extract features useful for downstream tasks: if you have a dataset of labeled images for instance, you can train a standard classifier by placing a linear layer on top of the pre-trained encoder. ## Intended uses & limitations You can use the raw model for image classification. See the model hub to look for fine-tuned versions on a task that interests you. ### How to use Here is how to use this model: ### BibTeX entry and citation info",
+ "model_explanation_gemini": "\"Facebook's ViT-MAE-base is a vision transformer model pre-trained with masked autoencoding (MAE) to reconstruct image patches, enabling downstream tasks like image classification.\"\n\nFeatures: \n- Base-sized Vision Transformer (ViT) architecture \n- Pre-trained using MAE (Masked Autoencoder) method \n- 75% of image patches masked during pre-training \n- Encoder-decoder structure for patch reconstruction \n- Suitable for transfer learning on image-related tasks \n\nComparison: \nUnlike standard"
+}
\ No newline at end of file
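The "How to use" stub above lost its code when the card was flattened. A minimal feature-extraction sketch with `ViTMAEModel` (the pre-training head is omitted as a simplification); the image URL is an arbitrary example, not one named in the card:

```python
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, ViTMAEModel

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # any RGB image works
image = Image.open(requests.get(url, stream=True).raw)

processor = AutoImageProcessor.from_pretrained("facebook/vit-mae-base")
model = ViTMAEModel.from_pretrained("facebook/vit-mae-base")

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Hidden states cover the [CLS] token plus the 25% of patches left visible by MAE masking
print(outputs.last_hidden_state.shape)
```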
diff --git a/model_data_json/facebook_wav2vec2-large.json b/model_data_json/facebook_wav2vec2-large.json
new file mode 100644
index 0000000000000000000000000000000000000000..a14187a79099298727b21b085b3bc913036a713b
--- /dev/null
+++ b/model_data_json/facebook_wav2vec2-large.json
@@ -0,0 +1,19 @@
+{
+ "model_id": "facebook/wav2vec2-large",
+ "downloads": 82081,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "wav2vec2",
+ "pretraining",
+ "speech",
+ "en",
+ "dataset:librispeech_asr",
+ "arxiv:2006.11477",
+ "license:apache-2.0",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- language: en datasets: - librispeech_asr tags: - speech license: apache-2.0 --- # Wav2Vec2-Large Facebook's Wav2Vec2 The base model pretrained on 16kHz sampled speech audio. When using the model make sure that your speech input is also sampled at 16Khz. Note that this model should be fine-tuned on a downstream task, like Automatic Speech Recognition. Check out this blog for more information. Paper Authors: Alexei Baevski, Henry Zhou, Abdelrahman Mohamed, Michael Auli **Abstract** We show for the first time that learning powerful representations from speech audio alone followed by fine-tuning on transcribed speech can outperform the best semi-supervised methods while being conceptually simpler. wav2vec 2.0 masks the speech input in the latent space and solves a contrastive task defined over a quantization of the latent representations which are jointly learned. Experiments using all labeled data of Librispeech achieve 1.8/3.3 WER on the clean/other test sets. When lowering the amount of labeled data to one hour, wav2vec 2.0 outperforms the previous state of the art on the 100 hour subset while using 100 times less labeled data. Using just ten minutes of labeled data and pre-training on 53k hours of unlabeled data still achieves 4.8/8.2 WER. This demonstrates the feasibility of speech recognition with limited amounts of labeled data. The original model can be found under # Usage See this notebook for more information on how to fine-tune the model.",
+ "model_explanation_gemini": "\"facebook_wav2vec2-large is a pretrained speech model for 16kHz audio, designed for fine-tuning on tasks like automatic speech recognition, leveraging self-supervised learning to achieve high accuracy with limited labeled data.\"\n\nFeatures: \n- Pretrained on 16kHz speech audio \n- Requires fine-tuning for downstream tasks (e.g., ASR) \n- Uses self-supervised learning with latent space masking and contrastive tasks \n- Achieves low WER (1.8/3"
+}
\ No newline at end of file
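Since this checkpoint is pretrained-only, the card points to a fine-tuning notebook rather than an inference recipe. As a sketch of how the raw encoder can be probed before fine-tuning (the one-second zero waveform is a stand-in, and the feature-extractor settings are assumptions rather than values from the card):

```python
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

feature_extractor = Wav2Vec2FeatureExtractor(sampling_rate=16000)  # input must be 16 kHz audio
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-large")

waveform = torch.zeros(16000).numpy()  # stand-in for one second of 16 kHz speech
inputs = feature_extractor(waveform, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state
print(hidden_states.shape)  # (1, frames, 1024) latent speech representations
```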
diff --git a/model_data_json/foduucom_table-detection-and-extraction.json b/model_data_json/foduucom_table-detection-and-extraction.json
new file mode 100644
index 0000000000000000000000000000000000000000..de95e29d2af57262162ad638ea7b588902bf1595
--- /dev/null
+++ b/model_data_json/foduucom_table-detection-and-extraction.json
@@ -0,0 +1,30 @@
+{
+ "model_id": "foduucom/table-detection-and-extraction",
+ "downloads": 83150,
+ "tags": [
+ "ultralytics",
+ "tensorboard",
+ "v8",
+ "ultralyticsplus",
+ "yolov8",
+ "yolo",
+ "vision",
+ "object-detection",
+ "pytorch",
+ "table detection",
+ "table extraction",
+ "table classification",
+ "document analysis",
+ "unstructured document",
+ "unstructured table extraction",
+ "structured table extraction",
+ "unstructured table detection",
+ "structured table detection",
+ "en",
+ "dataset:foduucom/table-detection-yolo",
+ "model-index",
+ "region:us"
+ ],
+ "description": "--- tags: - ultralyticsplus - yolov8 - ultralytics - yolo - vision - object-detection - pytorch - table detection - table extraction - table classification - document analysis - unstructured document - unstructured table extraction - structured table extraction - unstructured table detection - structured table detection library_name: ultralytics library_version: 8.0.43 inference: true model-index: - name: foduucom/table-detection-and-extraction results: - task: type: object-detection metrics: - type: precision value: 0.96196 name: mAP@0.5(box) language: - en metrics: - accuracy datasets: - foduucom/table-detection-yolo pipeline_tag: object-detection ---
# Model Card for YOLOv8s Table Detection ## Model Summary The YOLOv8s Table Detection model is an object detection model based on the YOLO (You Only Look Once) framework. It is designed to detect tables, whether they are bordered or borderless, in images. The model has been fine-tuned on a vast dataset and achieved high accuracy in detecting tables and distinguishing between bordered and borderless ones. ## Model Details ### Model Description The YOLOv8s Table Detection model serves as a versatile solution for precisely identifying tables within images, whether they exhibit a bordered or borderless design. Notably, this model's capabilities extend beyond mere detection – it plays a crucial role in addressing the complexities of unstructured documents. By employing advanced techniques such as bounding box delineation, the model enables users to isolate tables of interest within the visual content. What sets this model apart is its synergy with Optical Character Recognition (OCR) technology. This seamless integration empowers the model to not only locate tables but also to extract pertinent data contained within. The bounding box information guides the cropping of tables, which is then coupled with OCR to meticulously extract textual data, streamlining the process of information retrieval from unstructured documents. We invite you to explore the potential of this model and its data extraction capabilities. For those interested in harnessing its power or seeking further collaboration, we encourage you to reach out to us at info@foduu.com. Whether you require assistance, customization, or have innovative ideas, our collaborative approach is geared towards addressing your unique challenges. Additionally, you can actively engage with our vibrant community section for valuable insights and collective problem-solving. Your input drives our continuous improvement, as we collectively pave the way towards enhanced data extraction and document analysis. - **Developed by:** FODUU AI - **Model type:** Object Detection - **Task:** Table Detection (Bordered and Borderless) Furthermore, the YOLOv8s Table Detection model is not limited to table detection alone. It is a versatile tool that contributes to the processing of unstructured documents. By utilizing advanced bounding box techniques, the model empowers users to isolate tables within the document's visual content. What sets this model apart is its seamless integration with Optical Character Recognition (OCR) technology. The combination of bounding box information and OCR allows for precise data extraction from the tables. This comprehensive approach streamlines the process of information retrieval from complex documents. User collaboration is actively encouraged to enrich the model's capabilities. By contributing table images of different designs and types, users play a pivotal role in enhancing the model's ability to detect a diverse range of tables accurately. Community participation can be facilitated through our platform or by reaching out to us at info@foduu.com. We value collaborative efforts that drive continuous improvement and innovation in table detection and extraction. ### Supported Labels ## Uses ### Direct Use The YOLOv8s Table Detection model can be directly used for detecting tables in images, whether they are bordered or borderless. It is equipped with the ability to distinguish between these two categories. 
### Downstream Use The model can also be fine-tuned for specific table detection tasks or integrated into larger applications for furniture recognition, interior design, image-based data extraction, and other related fields. ### Out-of-Scope Use The model is not designed for unrelated object detection tasks or scenarios outside the scope of table detection. ## Bias, Risks, and Limitations The YOLOv8s Table Detection model may have some limitations and biases: - Performance may vary based on the quality, diversity, and representativeness of the training data. - The model may face challenges in detecting tables with intricate designs or complex arrangements. - Accuracy may be affected by variations in lighting conditions, image quality, and resolution. - Detection of very small or distant tables might be less accurate. - The model's ability to classify bordered and borderless tables may be influenced by variations in design. ### Recommendations Users should be informed about the model's limitations and potential biases. Further testing and validation are advised for specific use cases to evaluate its performance accurately. ## How to Get Started with the Model To begin using the YOLOv8s Table Detection model, follow these steps: - Load model and perform prediction: ## Training Details ### Training Data The model is trained on a diverse dataset containing images of tables from various sources. The dataset includes examples of both bordered and borderless tables, capturing different designs and styles. ### Training Procedure The training process involves extensive computation and is conducted over multiple epochs. The model's weights are adjusted to minimize detection loss and optimize performance. #### Metrics - mAP@0.5 (box): - All: 0.962 - Bordered: 0.961 - Borderless: 0.963 ### Model Architecture and Objective The YOLOv8s architecture employs a modified CSPDarknet53 as its backbone, along with self-attention mechanisms and feature pyramid networks. These components contribute to the model's ability to detect and classify tables accurately, considering variations in size, design, and style. ### Compute Infrastructure #### Hardware NVIDIA GeForce RTX 3060 card #### Software The model was trained and fine-tuned using a Jupyter Notebook environment. ## Model Card Contact For inquiries and contributions, please contact us at info@foduu.com. ---",
+ "model_explanation_gemini": "Detects and extracts tables (bordered/borderless) from images using YOLOv8, integrating OCR for data retrieval from unstructured documents.\n\nModel Features: \n- Object detection (YOLOv8-based) \n- Handles both bordered and borderless tables \n- Bounding box generation for table isolation \n- OCR integration for structured data extraction \n- Fine-tuned for high precision (mAP@0.5: 0.962) \n\nComparison: \nOutperforms generic object"
+}
\ No newline at end of file
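"Load model and perform prediction" is listed above without the accompanying code. A minimal sketch using the `ultralyticsplus` wrapper named in the tags; the thresholds and the image path are illustrative, not values from the card:

```python
from ultralyticsplus import YOLO, render_result

model = YOLO("foduucom/table-detection-and-extraction")

# Illustrative inference settings (not taken from the model card)
model.overrides["conf"] = 0.25  # detection confidence threshold
model.overrides["iou"] = 0.45   # NMS IoU threshold

image = "document_page.jpg"      # path or URL to a document image
results = model.predict(image)
print(results[0].boxes)          # bordered / borderless table boxes

render = render_result(model=model, image=image, result=results[0])
render.show()
```

Cropping each detected box and passing it to an OCR engine, as the card describes, would be a separate downstream step.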
diff --git a/model_data_json/fxmarty_pix2struct-tiny-random.json b/model_data_json/fxmarty_pix2struct-tiny-random.json
new file mode 100644
index 0000000000000000000000000000000000000000..6723b38e9b62300e98f818ab182247a29c3b86dc
--- /dev/null
+++ b/model_data_json/fxmarty_pix2struct-tiny-random.json
@@ -0,0 +1,16 @@
+{
+ "model_id": "fxmarty/pix2struct-tiny-random",
+ "downloads": 76445,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "pix2struct",
+ "image-text-to-text",
+ "image-to-text",
+ "license:mit",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: mit pipeline_tag: image-to-text ---",
+ "model_explanation_gemini": "Generates text from images, licensed under MIT.\n\nFeatures: \n- Image-to-text conversion \n- MIT license \n\nComparison: Not applicable (no comparative details provided in the description)."
+}
\ No newline at end of file
diff --git a/model_data_json/gokaygokay_Florence-2-SD3-Captioner.json b/model_data_json/gokaygokay_Florence-2-SD3-Captioner.json
new file mode 100644
index 0000000000000000000000000000000000000000..fa3756eaa3385729330f88e7b85eac2244d9ad1f
--- /dev/null
+++ b/model_data_json/gokaygokay_Florence-2-SD3-Captioner.json
@@ -0,0 +1,22 @@
+{
+ "model_id": "gokaygokay/Florence-2-SD3-Captioner",
+ "downloads": 76570,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "florence2",
+ "text-generation",
+ "art",
+ "image-text-to-text",
+ "custom_code",
+ "en",
+ "dataset:google/docci",
+ "dataset:google/imageinwords",
+ "dataset:ProGamerGov/synthetic-dataset-1m-dalle3-high-quality-captions",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 datasets: - google/docci - google/imageinwords - ProGamerGov/synthetic-dataset-1m-dalle3-high-quality-captions language: - en library_name: transformers pipeline_tag: image-text-to-text tags: - art ---",
+ "model_explanation_gemini": "Generates text captions for images using the Florence-2-SD3 architecture, trained on datasets like DocCI, ImageInWords, and synthetic DALL-E 3 captions. \n\n**Features:** \n- Image-to-text captioning \n- Trained on diverse datasets (DocCI, ImageInWords, synthetic DALL-E 3 captions) \n- Supports English language \n- Apache 2.0 license \n\n**Comparison (if inferred):** \nNo explicit comparison is made"
+}
\ No newline at end of file
diff --git a/model_data_json/google_gemma-1.1-2b-it.json b/model_data_json/google_gemma-1.1-2b-it.json
new file mode 100644
index 0000000000000000000000000000000000000000..52e4f5f91457cd1010b9b378d33dae315c9e66e7
--- /dev/null
+++ b/model_data_json/google_gemma-1.1-2b-it.json
@@ -0,0 +1,36 @@
+{
+ "model_id": "google/gemma-1.1-2b-it",
+ "downloads": 73021,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "gemma",
+ "text-generation",
+ "conversational",
+ "arxiv:2312.11805",
+ "arxiv:2009.03300",
+ "arxiv:1905.07830",
+ "arxiv:1911.11641",
+ "arxiv:1904.09728",
+ "arxiv:1905.10044",
+ "arxiv:1907.10641",
+ "arxiv:1811.00937",
+ "arxiv:1809.02789",
+ "arxiv:1911.01547",
+ "arxiv:1705.03551",
+ "arxiv:2107.03374",
+ "arxiv:2108.07732",
+ "arxiv:2110.14168",
+ "arxiv:2304.06364",
+ "arxiv:2206.04615",
+ "arxiv:1804.06876",
+ "arxiv:2110.08193",
+ "license:gemma",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- library_name: transformers license: gemma widget: - messages: - role: user content: How does the brain work? inference: parameters: max_new_tokens: 200 extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged-in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license --- # Gemma Model Card **Model Page**: Gemma This model card corresponds to the latest 2B instruct version of the Gemma model. Here you can find other models in the Gemma family: | | Base | Instruct | |----|----------------------------------------------------|----------------------------------------------------------------------| | 2B | gemma-2b | **gemma-1.1-2b-it** | | 7B | gemma-7b | gemma-1.1-7b-it | **Release Notes** This is Gemma 1.1 2B (IT), an update over the original instruction-tuned Gemma release. Gemma 1.1 was trained using a novel RLHF method, leading to substantial gains on quality, coding capabilities, factuality, instruction following and multi-turn conversation quality. We also fixed a bug in multi-turn conversations, and made sure that model responses don't always start with . We believe this release represents an improvement for most use cases, but we encourage users to test in their particular applications. The previous model will continue to be available in the same repo. We appreciate the enthusiastic adoption of Gemma, and we continue to welcome all feedback from the community. **Resources and Technical Documentation**: * Responsible Generative AI Toolkit * Gemma on Kaggle * Gemma on Vertex Model Garden **Terms of Use**: Terms **Authors**: Google ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Usage Below we share some code snippets on how to get quickly started with running the model. First make sure to , then copy the snippet from the section that is relevant for your usecase. #### Running the model on a CPU As explained below, we recommend as the default dtype. You can use a different precision if necessary. #### Running the model on a single / multi GPU #### Running the model on a GPU using different precisions The native weights of this model were exported in precision. You can use , which may be faster on certain hardware, indicating the when loading the model. For convenience, the revision of the repo contains a copy of the weights already converted to that precision. You can also use if you skip the dtype, but no precision increase will occur (model weights will just be upcasted to ). See examples below. 
* _Using _ * _Using _ * _Upcasting to _ #### Quantized Versions through * _Using 8-bit precision (int8)_ * _Using 4-bit precision_ #### Other optimizations * _Flash Attention 2_ First make sure to install in your environment #### Running the model in JAX / Flax Use the branch of the repository: Check this notebook for a comprehensive walkthrough on how to parallelize JAX inference. ### Chat Template The instruction-tuned models use a chat template that must be adhered to for conversational use. The easiest way to apply it is using the tokenizer's built-in chat template, as shown in the following snippet. Let's load the model and apply the chat template to a conversation. In this example, we'll start with a single user interaction: At this point, the prompt contains the following text: As you can see, each turn is preceded by a delimiter and then the role of the entity (either , for content supplied by the user, or for LLM responses). Turns finish with the token. You can follow this format to build the prompt manually, if you need to do it without the tokenizer's chat template. After the prompt is ready, generation can be performed like this: ### Fine-tuning You can find some fine-tuning scripts under the directory of []( repository. To adapt them to this model, simply change the model-id to . We provide: * A script to perform Supervised Fine-Tuning (SFT) on UltraChat dataset using QLoRA * A script to perform SFT using FSDP on TPU devices * A notebook that you can run on a free-tier Google Colab instance to perform SFT on the English quotes dataset ### Inputs and outputs * **Input:** Text string, such as a question, a prompt, or a document to be summarized. * **Output:** Generated English-language text in response to the input, such as an answer to a question, or a summary of a document. ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources, totaling 6 trillion tokens. Here are the key components: * Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. Primarily English-language content. * Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code or understand code-related questions. * Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. The combination of these diverse data sources is crucial for training a powerful language model that can handle a wide variety of different tasks and text formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: * CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content * Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. * Additional methods: Filtering based on content quality and safely in line with our policies. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using the latest generation of Tensor Processing Unit (TPU) hardware (TPUv5e). 
Training large language models requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: * Performance: TPUs are specifically designed to handle the massive computations involved in training LLMs. They can speed up training considerably compared to CPUs. * Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. * Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. * Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. * These advantages are aligned with Google's commitments to operate sustainably. ### Software Training was done using JAX and ML Pathways. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for foundation models, including large language models like these ones. Together, JAX and ML Pathways are used as described in the paper about the Gemini family of models; \"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\" ## Evaluation Model evaluation metrics and results. ### Benchmark Results The pre-trained base models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: | Benchmark | Metric | 2B Params | 7B Params | | ------------------------------ | ------------- | ----------- | --------- | | MMLU | 5-shot, top-1 | 42.3 | 64.3 | | HellaSwag | 0-shot |71.4 | 81.2 | | PIQA | 0-shot | 77.3 | 81.2 | | SocialIQA | 0-shot | 49.7 | 51.8 | | BooIQ | 0-shot | 69.4 | 83.2 | | WinoGrande | partial score | 65.4 | 72.3 | | CommonsenseQA | 7-shot | 65.3 | 71.3 | | OpenBookQA | | 47.8 | 52.8 | | ARC-e | | 73.2 | 81.5 | | ARC-c | | 42.1 | 53.2 | | TriviaQA | 5-shot | 53.2 | 63.4 | | Natural Questions | 5-shot | 12.5 | 23 | | HumanEval | pass@1 | 22.0 | 32.3 | | MBPP | 3-shot | 29.2 | 44.4 | | GSM8K | maj@1 | 17.7 | 46.4 | | MATH | 4-shot | 11.8 | 24.3 | | AGIEval | | 24.2 | 41.7 | | BIG-Bench | | 35.2 | 55.1 | | ------------------------------ | ------------- | ----------- | --------- | | **Average** | | **45.0** | **56.9** | ## Ethics and Safety Ethics and safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: * Text-to-Text Content Safety: Human evaluation on prompts covering safety policies including child sexual abuse and exploitation, harassment, violence and gore, and hate speech. 
* Text-to-Text Representational Harms: Benchmark against relevant academic datasets such as WinoBias and BBQ Dataset. * Memorization: Automated evaluation of memorization of training data, including the risk of personally identifiable information exposure. * Large-scale harm: Tests for \"dangerous capabilities,\" such as chemical, biological, radiological, and nuclear (CBRN) risks. ### Evaluation Results The results of ethics and safety evaluations are within acceptable thresholds for meeting internal policies for categories such as child safety, content safety, representational harms, memorization, large-scale harms. On top of robust internal evaluations, the results of well known safety benchmarks like BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA are shown here. #### Gemma 1.0 | Benchmark | Metric | Gemma 1.0 IT 2B | Gemma 1.0 IT 7B | | ------------------------ | ------------- | --------------- | --------------- | | [RealToxicity][realtox] | average | 6.86 | 7.90 | | [BOLD][bold] | | 45.57 | 49.08 | | [CrowS-Pairs][crows] | top-1 | 45.82 | 51.33 | | [BBQ Ambig][bbq] | 1-shot, top-1 | 62.58 | 92.54 | | [BBQ Disambig][bbq] | top-1 | 54.62 | 71.99 | | [Winogender][winogender] | top-1 | 51.25 | 54.17 | | [TruthfulQA][truthfulqa] | | 44.84 | 31.81 | | [Winobias 1_2][winobias] | | 56.12 | 59.09 | | [Winobias 2_2][winobias] | | 91.10 | 92.23 | | [Toxigen][toxigen] | | 29.77 | 39.59 | | ------------------------ | ------------- | --------------- | --------------- | #### Gemma 1.1 | Benchmark | Metric | Gemma 1.1 IT 2B | Gemma 1.1 IT 7B | | ------------------------ | ------------- | --------------- | --------------- | | [RealToxicity][realtox] | average | 7.03 | 8.04 | | [BOLD][bold] | | 47.76 | | | [CrowS-Pairs][crows] | top-1 | 45.89 | 49.67 | | [BBQ Ambig][bbq] | 1-shot, top-1 | 58.97 | 86.06 | | [BBQ Disambig][bbq] | top-1 | 53.90 | 85.08 | | [Winogender][winogender] | top-1 | 50.14 | 57.64 | | [TruthfulQA][truthfulqa] | | 44.24 | 45.34 | | [Winobias 1_2][winobias] | | 55.93 | 59.22 | | [Winobias 2_2][winobias] | | 89.46 | 89.2 | | [Toxigen][toxigen] | | 29.64 | 38.75 | | ------------------------ | ------------- | --------------- | --------------- | ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open Large Language Models (LLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. * Content Creation and Communication * Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. * Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. * Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. * Research and Education * Natural Language Processing (NLP) Research: These models can serve as a foundation for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field. * Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. 
* Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations * Training Data * The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. * The scope of the training dataset determines the subject areas the model can handle effectively. * Context and Task Complexity * LLMs are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. * A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). * Language Ambiguity and Nuance * Natural language is inherently complex. LLMs might struggle to grasp subtle nuances, sarcasm, or figurative language. * Factual Accuracy * LLMs generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. * Common Sense * LLMs rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of large language models (LLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: * Bias and Fairness * LLMs trained on large-scale, real-world text data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, input data pre-processing described and posterior evaluations reported in this card. * Misinformation and Misuse * LLMs can be misused to generate text that is false, misleading, or harmful. * Guidelines are provided for responsible use with the model, see the Responsible Generative AI Toolkit. * Transparency and Accountability: * This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. * A responsibly developed open model offers the opportunity to share innovation by making LLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: * Perpetuation of biases: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. * Generation of harmful content: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. * Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate against malicious applications of LLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the Gemma Prohibited Use Policy. * Privacy violations: Models were trained on data filtered for removal of PII (Personally Identifiable Information). Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. ### Benefits At the time of release, this family of models provides high-performance open large language model implementations designed from the ground up for Responsible AI development compared to similarly sized models. 
Using the benchmark evaluation metrics described in this document, these models have shown to provide superior performance to other, comparably-sized open model alternatives.",
+ "model_explanation_gemini": "A lightweight, instruction-tuned text generation model optimized for tasks like question answering, summarization, and reasoning, with improved quality and coding capabilities over its predecessor.\n\n**Features**: \n- Instruction-tuned for conversational use \n- Supports text generation tasks (QA, summarization, reasoning) \n- Optimized for limited-resource environments (CPU/GPU) \n- Improved factuality and multi-turn conversation quality \n- Available in 2B parameter size \n\n**Comparison**: \nThe 1.1"
+}
\ No newline at end of file
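The chat-template walkthrough above refers to code snippets that were stripped from this dump. A minimal sketch, assuming the gated Gemma license has been accepted on the Hub and that `torch` plus `accelerate` are installed; `max_new_tokens=200` mirrors the widget configuration in the card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-1.1-2b-it"  # gated: requires accepting the Gemma license on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Build the prompt with the tokenizer's built-in chat template, as the card recommends
chat = [{"role": "user", "content": "How does the brain work?"}]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```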
diff --git a/model_data_json/google_gemma-2-27b-it.json b/model_data_json/google_gemma-2-27b-it.json
new file mode 100644
index 0000000000000000000000000000000000000000..ad511c72135c6b33810dc886beaf8aad8e6b5042
--- /dev/null
+++ b/model_data_json/google_gemma-2-27b-it.json
@@ -0,0 +1,43 @@
+{
+ "model_id": "google/gemma-2-27b-it",
+ "downloads": 135109,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "gemma2",
+ "text-generation",
+ "conversational",
+ "arxiv:2009.03300",
+ "arxiv:1905.07830",
+ "arxiv:1911.11641",
+ "arxiv:1904.09728",
+ "arxiv:1905.10044",
+ "arxiv:1907.10641",
+ "arxiv:1811.00937",
+ "arxiv:1809.02789",
+ "arxiv:1911.01547",
+ "arxiv:1705.03551",
+ "arxiv:2107.03374",
+ "arxiv:2108.07732",
+ "arxiv:2110.14168",
+ "arxiv:2009.11462",
+ "arxiv:2101.11718",
+ "arxiv:2110.08193",
+ "arxiv:1804.09301",
+ "arxiv:2109.07958",
+ "arxiv:1804.06876",
+ "arxiv:2103.03874",
+ "arxiv:2304.06364",
+ "arxiv:2206.04615",
+ "arxiv:2203.09509",
+ "base_model:google/gemma-2-27b",
+ "base_model:finetune:google/gemma-2-27b",
+ "license:gemma",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: gemma library_name: transformers pipeline_tag: text-generation extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: >- To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license base_model: google/gemma-2-27b --- # Gemma 2 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma] **Terms of Use**: Terms **Authors**: Google ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights for both pre-trained variants and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Usage Below we share some code snippets on how to get quickly started with running the model. First, install the Transformers library with: Then, copy the snippet from the section that is relevant for your usecase. #### Running with the API #### Running the model on a single / multi GPU You can ensure the correct chat template is applied by using as follows: #### Running the model on a GPU using different precisions The native weights of this model were exported in precision. You can also use if you skip the dtype, but no precision increase will occur (model weights will just be upcasted to ). See examples below. * _Upcasting to _ #### Running the model through a CLI The local-gemma repository contains a lightweight wrapper around Transformers for running Gemma 2 through a command line interface, or CLI. Follow the installation instructions for getting started, then launch the CLI through the following command: #### Quantized Versions through Using 8-bit precision (int8) Using 4-bit precision #### Advanced Usage Torch compile Torch compile is a method for speeding-up the inference of PyTorch modules. The Gemma-2 model can be run up to 6x faster by leveraging torch compile. Note that two warm-up steps are required before the full inference speed is realised: For more details, refer to the Transformers documentation. ### Chat Template The instruction-tuned models use a chat template that must be adhered to for conversational use. The easiest way to apply it is using the tokenizer's built-in chat template, as shown in the following snippet. Let's load the model and apply the chat template to a conversation. In this example, we'll start with a single user interaction: At this point, the prompt contains the following text: As you can see, each turn is preceded by a delimiter and then the role of the entity (either , for content supplied by the user, or for LLM responses). Turns finish with the token. 
You can follow this format to build the prompt manually, if you need to do it without the tokenizer's chat template. After the prompt is ready, generation can be performed like this: ### Inputs and outputs * **Input:** Text string, such as a question, a prompt, or a document to be summarized. * **Output:** Generated English-language text in response to the input, such as an answer to a question, or a summary of a document. ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 13 trillion tokens and the 9B model was trained with 8 trillion tokens. Here are the key components: * Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. Primarily English-language content. * Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code or understand code-related questions. * Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. The combination of these diverse data sources is crucial for training a powerful language model that can handle a wide variety of different tasks and text formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: * CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. * Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. * Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using the latest generation of [Tensor Processing Unit (TPU)][tpu] hardware (TPUv5p). Training large language models requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: * Performance: TPUs are specifically designed to handle the massive computations involved in training LLMs. They can speed up training considerably compared to CPUs. * Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. * Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. * Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. * These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. 
JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for [foundation models][foundation-models], including large language models like these ones. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; \"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\" ## Evaluation Model evaluation metrics and results. ### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: | Benchmark | Metric | Gemma PT 9B | Gemma PT 27B | | ------------------------------ | ------------- | ----------- | ------------ | | [MMLU][mmlu] | 5-shot, top-1 | 71.3 | 75.2 | | [HellaSwag][hellaswag] | 10-shot | 81.9 | 86.4 | | [PIQA][piqa] | 0-shot | 81.7 | 83.2 | | [SocialIQA][socialiqa] | 0-shot | 53.4 | 53.7 | | [BoolQ][boolq] | 0-shot | 84.2 | 84.8 | | [WinoGrande][winogrande] | partial score | 80.6 | 83.7 | | [ARC-e][arc] | 0-shot | 88.0 | 88.6 | | [ARC-c][arc] | 25-shot | 68.4 | 71.4 | | [TriviaQA][triviaqa] | 5-shot | 76.6 | 83.7 | | [Natural Questions][naturalq] | 5-shot | 29.2 | 34.5 | | [HumanEval][humaneval] | pass@1 | 40.2 | 51.8 | | [MBPP][mbpp] | 3-shot | 52.4 | 62.6 | | [GSM8K][gsm8k] | 5-shot, maj@1 | 68.6 | 74.0 | | [MATH][math] | 4-shot | 36.6 | 42.3 | | [AGIEval][agieval] | 3-5-shot | 52.8 | 55.1 | | [BIG-Bench][big-bench] | 3-shot, CoT | 68.2 | 74.9 | | ------------------------------ | ------------- | ----------- | ------------ | ## Ethics and Safety Ethics and safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: * Text-to-Text Content Safety: Human evaluation on prompts covering safety policies including child sexual abuse and exploitation, harassment, violence and gore, and hate speech. * Text-to-Text Representational Harms: Benchmark against relevant academic datasets such as [WinoBias][winobias] and [BBQ Dataset][bbq]. * Memorization: Automated evaluation of memorization of training data, including the risk of personally identifiable information exposure. * Large-scale harm: Tests for \"dangerous capabilities,\" such as chemical, biological, radiological, and nuclear (CBRN) risks. ### Evaluation Results The results of ethics and safety evaluations are within acceptable thresholds for meeting [internal policies][safety-policies] for categories such as child safety, content safety, representational harms, memorization, large-scale harms. On top of robust internal evaluations, the results of well-known safety benchmarks like BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA are shown here. 
#### Gemma 2.0 | Benchmark | Metric | Gemma 2 IT 9B | Gemma 2 IT 27B | | ------------------------ | ------------- | --------------- | ---------------- | | [RealToxicity][realtox] | average | 8.25 | 8.84 | | [CrowS-Pairs][crows] | top-1 | 37.47 | 36.67 | | [BBQ Ambig][bbq] | 1-shot, top-1 | 88.58 | 85.99 | | [BBQ Disambig][bbq] | top-1 | 82.67 | 86.94 | | [Winogender][winogender] | top-1 | 79.17 | 77.22 | | [TruthfulQA][truthfulqa] | | 50.27 | 51.60 | | [Winobias 1_2][winobias] | | 78.09 | 81.94 | | [Winobias 2_2][winobias] | | 95.32 | 97.22 | | [Toxigen][toxigen] | | 39.30 | 38.42 | | ------------------------ | ------------- | --------------- | ---------------- | ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open Large Language Models (LLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. * Content Creation and Communication * Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. * Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. * Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. * Research and Education * Natural Language Processing (NLP) Research: These models can serve as a foundation for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field. * Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. * Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations * Training Data * The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. * The scope of the training dataset determines the subject areas the model can handle effectively. * Context and Task Complexity * LLMs are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. * A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). * Language Ambiguity and Nuance * Natural language is inherently complex. LLMs might struggle to grasp subtle nuances, sarcasm, or figurative language. * Factual Accuracy * LLMs generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. * Common Sense * LLMs rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of large language models (LLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: * Bias and Fairness * LLMs trained on large-scale, real-world text data can reflect socio-cultural biases embedded in the training material. 
These models underwent careful scrutiny, input data pre-processing described and posterior evaluations reported in this card. * Misinformation and Misuse * LLMs can be misused to generate text that is false, misleading, or harmful. * Guidelines are provided for responsible use with the model, see the [Responsible Generative AI Toolkit][rai-toolkit]. * Transparency and Accountability: * This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. * A responsibly developed open model offers the opportunity to share innovation by making LLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: * Perpetuation of biases: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. * Generation of harmful content: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. * Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate against malicious applications of LLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. * Privacy violations: Models were trained on data filtered for removal of PII (Personally Identifiable Information). Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. ### Benefits At the time of release, this family of models provides high-performance open large language model implementations designed from the ground up for Responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have shown to provide superior performance to other, comparably-sized open model alternatives. [rai-toolkit]: [kaggle-gemma]: [terms]: [vertex-mg-gemma]: [sensitive-info]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [sustainability]: [foundation-models]: [gemini-2-paper]: [mmlu]: [hellaswag]: [piqa]: [socialiqa]: [boolq]: [winogrande]: [commonsenseqa]: [openbookqa]: [arc]: [triviaqa]: [naturalq]: [humaneval]: [mbpp]: [gsm8k]: [realtox]: [bold]: [crows]: [bbq]: [winogender]: [truthfulqa]: [winobias]: [math]: [agieval]: [big-bench]: [toxigen]:",
+ "model_explanation_gemini": "A lightweight, decoder-only text-generation model designed for tasks like question answering, summarization, and reasoning, optimized for deployment in resource-limited environments. \n\n**Features**: \n- Text-to-text generation (English) \n- Instruction-tuned for conversational use \n- Supports single/multi-GPU, quantization (4-bit/8-bit), and torch.compile optimization \n- Chat template for structured prompts \n\n**Comparison**: \nSmaller and more accessible than Gemini models but built from similar research, enabling local or"
+}
\ No newline at end of file
diff --git a/model_data_json/google_gemma-2-2b-it.json b/model_data_json/google_gemma-2-2b-it.json
new file mode 100644
index 0000000000000000000000000000000000000000..fb9608518ec441f045d0a3f37cba0253d2ca0261
--- /dev/null
+++ b/model_data_json/google_gemma-2-2b-it.json
@@ -0,0 +1,45 @@
+{
+ "model_id": "google/gemma-2-2b-it",
+ "downloads": 317158,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "gemma2",
+ "text-generation",
+ "conversational",
+ "arxiv:2009.03300",
+ "arxiv:1905.07830",
+ "arxiv:1911.11641",
+ "arxiv:1904.09728",
+ "arxiv:1905.10044",
+ "arxiv:1907.10641",
+ "arxiv:1811.00937",
+ "arxiv:1809.02789",
+ "arxiv:1911.01547",
+ "arxiv:1705.03551",
+ "arxiv:2107.03374",
+ "arxiv:2108.07732",
+ "arxiv:2110.14168",
+ "arxiv:2009.11462",
+ "arxiv:2101.11718",
+ "arxiv:2110.08193",
+ "arxiv:1804.09301",
+ "arxiv:2109.07958",
+ "arxiv:1804.06876",
+ "arxiv:2103.03874",
+ "arxiv:2304.06364",
+ "arxiv:1903.00161",
+ "arxiv:2206.04615",
+ "arxiv:2203.09509",
+ "arxiv:2403.13793",
+ "base_model:google/gemma-2-2b",
+ "base_model:finetune:google/gemma-2-2b",
+ "license:gemma",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: gemma library_name: transformers pipeline_tag: text-generation extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: >- To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license tags: - conversational base_model: google/gemma-2-2b --- # Gemma 2 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma2] **Terms of Use**: [Terms][terms] **Authors**: Google ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights for both pre-trained variants and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Usage Below we share some code snippets on how to get quickly started with running the model. First, install the Transformers library with: Then, copy the snippet from the section that is relevant for your usecase. #### Running with the API #### Running the model on a single / multi GPU You can ensure the correct chat template is applied by using as follows: #### Running the model on a GPU using different precisions The native weights of this model were exported in precision. You can also use if you skip the dtype, but no precision increase will occur (model weights will just be upcasted to ). See examples below. * _Upcasting to _ #### Running the model through a CLI The local-gemma repository contains a lightweight wrapper around Transformers for running Gemma 2 through a command line interface, or CLI. Follow the installation instructions for getting started, then launch the CLI through the following command: #### Quantized Versions through Using 8-bit precision (int8) Using 4-bit precision #### Advanced Usage Torch compile Torch compile is a method for speeding-up the inference of PyTorch modules. The Gemma-2 2b model can be run up to 6x faster by leveraging torch compile. Note that two warm-up steps are required before the full inference speed is realised: For more details, refer to the Transformers documentation. ### Chat Template The instruction-tuned models use a chat template that must be adhered to for conversational use. The easiest way to apply it is using the tokenizer's built-in chat template, as shown in the following snippet. Let's load the model and apply the chat template to a conversation. In this example, we'll start with a single user interaction: At this point, the prompt contains the following text: As you can see, each turn is preceded by a delimiter and then the role of the entity (either , for content supplied by the user, or for LLM responses). Turns finish with the token. 
You can follow this format to build the prompt manually, if you need to do it without the tokenizer's chat template. After the prompt is ready, generation can be performed like this: ### Inputs and outputs * **Input:** Text string, such as a question, a prompt, or a document to be summarized. * **Output:** Generated English-language text in response to the input, such as an answer to a question, or a summary of a document. ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 13 trillion tokens, the 9B model was trained with 8 trillion tokens, and 2B model was trained with 2 trillion tokens. Here are the key components: * Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. Primarily English-language content. * Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code or understand code-related questions. * Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. The combination of these diverse data sources is crucial for training a powerful language model that can handle a wide variety of different tasks and text formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: * CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. * Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. * Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using the latest generation of [Tensor Processing Unit (TPU)][tpu] hardware (TPUv5p). Training large language models requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: * Performance: TPUs are specifically designed to handle the massive computations involved in training LLMs. They can speed up training considerably compared to CPUs. * Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. * Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. * Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. * These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. 
JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for [foundation models][foundation-models], including large language models like these ones. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; \"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\" ## Evaluation Model evaluation metrics and results. ### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: | Benchmark | Metric | Gemma 2 PT 2B | Gemma 2 PT 9B | Gemma 2 PT 27B | | ------------------------------ | ------------- | ------------- | ------------- | -------------- | | [MMLU][mmlu] | 5-shot, top-1 | 51.3 | 71.3 | 75.2 | | [HellaSwag][hellaswag] | 10-shot | 73.0 | 81.9 | 86.4 | | [PIQA][piqa] | 0-shot | 77.8 | 81.7 | 83.2 | | [SocialIQA][socialiqa] | 0-shot | 51.9 | 53.4 | 53.7 | | [BoolQ][boolq] | 0-shot | 72.5 | 84.2 | 84.8 | | [WinoGrande][winogrande] | partial score | 70.9 | 80.6 | 83.7 | | [ARC-e][arc] | 0-shot | 80.1 | 88.0 | 88.6 | | [ARC-c][arc] | 25-shot | 55.4 | 68.4 | 71.4 | | [TriviaQA][triviaqa] | 5-shot | 59.4 | 76.6 | 83.7 | | [Natural Questions][naturalq] | 5-shot | 16.7 | 29.2 | 34.5 | | [HumanEval][humaneval] | pass@1 | 17.7 | 40.2 | 51.8 | | [MBPP][mbpp] | 3-shot | 29.6 | 52.4 | 62.6 | | [GSM8K][gsm8k] | 5-shot, maj@1 | 23.9 | 68.6 | 74.0 | | [MATH][math] | 4-shot | 15.0 | 36.6 | 42.3 | | [AGIEval][agieval] | 3-5-shot | 30.6 | 52.8 | 55.1 | | [DROP][drop] | 3-shot, F1 | 52.0 | 69.4 | 72.2 | | [BIG-Bench][big-bench] | 3-shot, CoT | 41.9 | 68.2 | 74.9 | ## Ethics and Safety Ethics and safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: * Text-to-Text Content Safety: Human evaluation on prompts covering safety policies including child sexual abuse and exploitation, harassment, violence and gore, and hate speech. * Text-to-Text Representational Harms: Benchmark against relevant academic datasets such as [WinoBias][winobias] and [BBQ Dataset][bbq]. * Memorization: Automated evaluation of memorization of training data, including the risk of personally identifiable information exposure. * Large-scale harm: Tests for \"dangerous capabilities,\" such as chemical, biological, radiological, and nuclear (CBRN) risks. ### Evaluation Results The results of ethics and safety evaluations are within acceptable thresholds for meeting [internal policies][safety-policies] for categories such as child safety, content safety, representational harms, memorization, large-scale harms. On top of robust internal evaluations, the results of well-known safety benchmarks like BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA are shown here. 
#### Gemma 2.0 | Benchmark | Metric | Gemma 2 IT 2B | Gemma 2 IT 9B | Gemma 2 IT 27B | | ------------------------ | ------------- | ------------- | ------------- | -------------- | | [RealToxicity][realtox] | average | 8.16 | 8.25 | 8.84 | | [CrowS-Pairs][crows] | top-1 | 37.67 | 37.47 | 36.67 | | [BBQ Ambig][bbq] | 1-shot, top-1 | 83.20 | 88.58 | 85.99 | | [BBQ Disambig][bbq] | top-1 | 69.31 | 82.67 | 86.94 | | [Winogender][winogender] | top-1 | 52.91 | 79.17 | 77.22 | | [TruthfulQA][truthfulqa] | | 43.72 | 50.27 | 51.60 | | [Winobias 1_2][winobias] | | 59.28 | 78.09 | 81.94 | | [Winobias 2_2][winobias] | | 88.57 | 95.32 | 97.22 | | [Toxigen][toxigen] | | 48.32 | 39.30 | 38.42 | ## Dangerous Capability Evaluations ### Evaluation Approach We evaluated a range of dangerous capabilities: - **Offensive cybersecurity:** To assess the model's potential for misuse in cybersecurity contexts, we utilized both publicly available Capture-the-Flag (CTF) platforms like InterCode-CTF and Hack the Box, as well as internally developed CTF challenges. These evaluations measure the model's ability to exploit vulnerabilities and gain unauthorized access in simulated environments. - **Self-proliferation:** We evaluated the model's capacity for self-proliferation by designing tasks that involve resource acquisition, code execution, and interaction with remote systems. These evaluations assess the model's ability to independently replicate and spread. - **Persuasion:** To evaluate the model's capacity for persuasion and deception, we conducted human persuasion studies. These studies involved scenarios that measure the model's ability to build rapport, influence beliefs, and elicit specific actions from human participants. ### Evaluation Results All evaluations are described in detail in [Evaluating Frontier Models for Dangerous Capabilities][eval-danger] and in brief in the [Gemma 2 technical report][tech-report].
| Evaluation | Capability | Gemma 2 IT 27B |
| -------------------------------- | ----------------------- | ------------------------------------------------------------------------------------------------------ |
| InterCode-CTF | Offensive cybersecurity | 34/76 challenges |
| Internal CTF | Offensive cybersecurity | 1/13 challenges |
| Hack the Box | Offensive cybersecurity | 0/13 challenges |
| Self-proliferation early warning | Self-proliferation | 1/10 challenges |
| Charm offensive | Persuasion | Percent of participants agreeing: 81% interesting, 75% would speak again, 80% made personal connection |
| Click Links | Persuasion | 34% of participants |
| Find Info | Persuasion | 9% of participants |
| Run Code | Persuasion | 11% of participants |
| Money talks | Persuasion | £3.72 mean donation |
| Web of Lies | Persuasion | 18% mean shift towards correct belief, 1% mean shift towards incorrect belief |
## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open Large Language Models (LLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. * Content Creation and Communication * Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. * Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. * Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. * Research and Education * Natural Language Processing (NLP) Research: These models can serve as a foundation for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field. * Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. * Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations * Training Data * The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. * The scope of the training dataset determines the subject areas the model can handle effectively. * Context and Task Complexity * LLMs are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. * A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). * Language Ambiguity and Nuance * Natural language is inherently complex. LLMs might struggle to grasp subtle nuances, sarcasm, or figurative language. * Factual Accuracy * LLMs generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. * Common Sense * LLMs rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of large language models (LLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: * Bias and Fairness * LLMs trained on large-scale, real-world text data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, input data pre-processing described and posterior evaluations reported in this card. * Misinformation and Misuse * LLMs can be misused to generate text that is false, misleading, or harmful. * Guidelines are provided for responsible use with the model, see the [Responsible Generative AI Toolkit][rai-toolkit]. * Transparency and Accountability: * This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. * A responsibly developed open model offers the opportunity to share innovation by making LLM technology accessible to developers and researchers across the AI ecosystem. 
Risks identified and mitigations: * Perpetuation of biases: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. * Generation of harmful content: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. * Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate against malicious applications of LLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. * Privacy violations: Models were trained on data filtered for removal of PII (Personally Identifiable Information). Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. ### Benefits At the time of release, this family of models provides high-performance open large language model implementations designed from the ground up for Responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have shown to provide superior performance to other, comparably-sized open model alternatives. [tech-report]: [rai-toolkit]: [kaggle-gemma]: [terms]: [vertex-mg-gemma2]: [sensitive-info]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [sustainability]: [foundation-models]: [gemini-2-paper]: [mmlu]: [hellaswag]: [piqa]: [socialiqa]: [boolq]: [winogrande]: [commonsenseqa]: [openbookqa]: [arc]: [triviaqa]: [naturalq]: [humaneval]: [mbpp]: [gsm8k]: [realtox]: [bold]: [crows]: [bbq]: [winogender]: [truthfulqa]: [winobias]: [math]: [agieval]: [drop]: [big-bench]: [toxigen]: [eval-danger]:",
+ "model_explanation_gemini": "Generates English-language text for tasks like question answering, summarization, and reasoning, optimized for resource-limited environments.\n\nFeatures: \n- Lightweight, decoder-only LLM \n- Instruction-tuned for conversational use \n- Supports single/multi-GPU deployment \n- Compatible with 8-bit (int8) and 4-bit quantization \n- Includes chat template for structured dialogue \n- Torch compile support for faster inference \n\nComparison: \nPart of Google's Gemma family, sharing research with"
+}
\ No newline at end of file
diff --git a/model_data_json/google_gemma-2-2b.json b/model_data_json/google_gemma-2-2b.json
new file mode 100644
index 0000000000000000000000000000000000000000..9e633ee9872da2d92a518988e5d67206d006b0b7
--- /dev/null
+++ b/model_data_json/google_gemma-2-2b.json
@@ -0,0 +1,42 @@
+{
+ "model_id": "google/gemma-2-2b",
+ "downloads": 155581,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "gemma2",
+ "text-generation",
+ "arxiv:2009.03300",
+ "arxiv:1905.07830",
+ "arxiv:1911.11641",
+ "arxiv:1904.09728",
+ "arxiv:1905.10044",
+ "arxiv:1907.10641",
+ "arxiv:1811.00937",
+ "arxiv:1809.02789",
+ "arxiv:1911.01547",
+ "arxiv:1705.03551",
+ "arxiv:2107.03374",
+ "arxiv:2108.07732",
+ "arxiv:2110.14168",
+ "arxiv:2009.11462",
+ "arxiv:2101.11718",
+ "arxiv:2110.08193",
+ "arxiv:1804.09301",
+ "arxiv:2109.07958",
+ "arxiv:1804.06876",
+ "arxiv:2103.03874",
+ "arxiv:2304.06364",
+ "arxiv:1903.00161",
+ "arxiv:2206.04615",
+ "arxiv:2203.09509",
+ "arxiv:2403.13793",
+ "license:gemma",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: gemma library_name: transformers pipeline_tag: text-generation extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: >- To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license --- # Gemma 2 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma2] **Terms of Use**: [Terms][terms] **Authors**: Google ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights for both pre-trained variants and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Usage Below we share some code snippets on how to get quickly started with running the model. First, install the Transformers library with: Then, copy the snippet from the section that is relevant for your usecase. #### Running with the API #### Running the model on a single / multi GPU #### Running the model through a CLI The local-gemma repository contains a lightweight wrapper around Transformers for running Gemma 2 through a command line interface, or CLI. Follow the installation instructions for getting started, then launch the CLI through the following command: #### Quantized Versions through Using 8-bit precision (int8) Using 4-bit precision #### Advanced Usage Torch compile Torch compile is a method for speeding-up the inference of PyTorch modules. The Gemma-2 2b model can be run up to 6x faster by leveraging torch compile. Note that two warm-up steps are required before the full inference speed is realised: For more details, refer to the Transformers documentation. ### Inputs and outputs * **Input:** Text string, such as a question, a prompt, or a document to be summarized. * **Output:** Generated English-language text in response to the input, such as an answer to a question, or a summary of a document. ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 13 trillion tokens, the 9B model was trained with 8 trillion tokens, and 2B model was trained with 2 trillion tokens. Here are the key components: * Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. Primarily English-language content. * Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code or understand code-related questions. 
* Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. The combination of these diverse data sources is crucial for training a powerful language model that can handle a wide variety of different tasks and text formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: * CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. * Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. * Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using the latest generation of [Tensor Processing Unit (TPU)][tpu] hardware (TPUv5p). Training large language models requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: * Performance: TPUs are specifically designed to handle the massive computations involved in training LLMs. They can speed up training considerably compared to CPUs. * Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. * Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. * Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. * These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for [foundation models][foundation-models], including large language models like these ones. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; \"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\" ## Evaluation Model evaluation metrics and results. 
### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: | Benchmark | Metric | Gemma 2 PT 2B | Gemma 2 PT 9B | Gemma 2 PT 27B | | ------------------------------ | ------------- | ------------- | ------------- | -------------- | | [MMLU][mmlu] | 5-shot, top-1 | 51.3 | 71.3 | 75.2 | | [HellaSwag][hellaswag] | 10-shot | 73.0 | 81.9 | 86.4 | | [PIQA][piqa] | 0-shot | 77.8 | 81.7 | 83.2 | | [SocialIQA][socialiqa] | 0-shot | 51.9 | 53.4 | 53.7 | | [BoolQ][boolq] | 0-shot | 72.5 | 84.2 | 84.8 | | [WinoGrande][winogrande] | partial score | 70.9 | 80.6 | 83.7 | | [ARC-e][arc] | 0-shot | 80.1 | 88.0 | 88.6 | | [ARC-c][arc] | 25-shot | 55.4 | 68.4 | 71.4 | | [TriviaQA][triviaqa] | 5-shot | 59.4 | 76.6 | 83.7 | | [Natural Questions][naturalq] | 5-shot | 16.7 | 29.2 | 34.5 | | [HumanEval][humaneval] | pass@1 | 17.7 | 40.2 | 51.8 | | [MBPP][mbpp] | 3-shot | 29.6 | 52.4 | 62.6 | | [GSM8K][gsm8k] | 5-shot, maj@1 | 23.9 | 68.6 | 74.0 | | [MATH][math] | 4-shot | 15.0 | 36.6 | 42.3 | | [AGIEval][agieval] | 3-5-shot | 30.6 | 52.8 | 55.1 | | [DROP][drop] | 3-shot, F1 | 52.0 | 69.4 | 72.2 | | [BIG-Bench][big-bench] | 3-shot, CoT | 41.9 | 68.2 | 74.9 | ## Ethics and Safety Ethics and safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: * Text-to-Text Content Safety: Human evaluation on prompts covering safety policies including child sexual abuse and exploitation, harassment, violence and gore, and hate speech. * Text-to-Text Representational Harms: Benchmark against relevant academic datasets such as [WinoBias][winobias] and [BBQ Dataset][bbq]. * Memorization: Automated evaluation of memorization of training data, including the risk of personally identifiable information exposure. * Large-scale harm: Tests for \"dangerous capabilities,\" such as chemical, biological, radiological, and nuclear (CBRN) risks. ### Evaluation Results The results of ethics and safety evaluations are within acceptable thresholds for meeting [internal policies][safety-policies] for categories such as child safety, content safety, representational harms, memorization, large-scale harms. On top of robust internal evaluations, the results of well-known safety benchmarks like BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA are shown here. 
#### Gemma 2.0 | Benchmark | Metric | Gemma 2 IT 2B | Gemma 2 IT 9B | Gemma 2 IT 27B | | ------------------------ | ------------- | ------------- | ------------- | -------------- | | [RealToxicity][realtox] | average | 8.16 | 8.25 | 8.84 | | [CrowS-Pairs][crows] | top-1 | 37.67 | 37.47 | 36.67 | | [BBQ Ambig][bbq] | 1-shot, top-1 | 83.20 | 88.58 | 85.99 | | [BBQ Disambig][bbq] | top-1 | 69.31 | 82.67 | 86.94 | | [Winogender][winogender] | top-1 | 52.91 | 79.17 | 77.22 | | [TruthfulQA][truthfulqa] | | 43.72 | 50.27 | 51.60 | | [Winobias 1_2][winobias] | | 59.28 | 78.09 | 81.94 | | [Winobias 2_2][winobias] | | 88.57 | 95.32 | 97.22 | | [Toxigen][toxigen] | | 48.32 | 39.30 | 38.42 | ## Dangerous Capability Evaluations ### Evaluation Approach We evaluated a range of dangerous capabilities: - **Offensive cybersecurity:** To assess the model's potential for misuse in cybersecurity contexts, we utilized both publicly available Capture-the-Flag (CTF) platforms like InterCode-CTF and Hack the Box, as well as internally developed CTF challenges. These evaluations measure the model's ability to exploit vulnerabilities and gain unauthorized access in simulated environments. - **Self-proliferation:** We evaluated the model's capacity for self-proliferation by designing tasks that involve resource acquisition, code execution, and interaction with remote systems. These evaluations assess the model's ability to independently replicate and spread. - **Persuasion:** To evaluate the model's capacity for persuasion and deception, we conducted human persuasion studies. These studies involved scenarios that measure the model's ability to build rapport, influence beliefs, and elicit specific actions from human participants. ### Evaluation Results All evaluations are described in detail in [Evaluating Frontier Models for Dangerous Capabilities][eval-danger] and in brief in the [Gemma 2 technical report][tech-report].
| Evaluation | Capability | Gemma 2 IT 27B |
| -------------------------------- | ----------------------- | ------------------------------------------------------------------------------------------------------ |
| InterCode-CTF | Offensive cybersecurity | 34/76 challenges |
| Internal CTF | Offensive cybersecurity | 1/13 challenges |
| Hack the Box | Offensive cybersecurity | 0/13 challenges |
| Self-proliferation early warning | Self-proliferation | 1/10 challenges |
| Charm offensive | Persuasion | Percent of participants agreeing: 81% interesting, 75% would speak again, 80% made personal connection |
| Click Links | Persuasion | 34% of participants |
| Find Info | Persuasion | 9% of participants |
| Run Code | Persuasion | 11% of participants |
| Money talks | Persuasion | £3.72 mean donation |
| Web of Lies | Persuasion | 18% mean shift towards correct belief, 1% mean shift towards incorrect belief |
## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open Large Language Models (LLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. * Content Creation and Communication * Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. * Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. * Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. * Research and Education * Natural Language Processing (NLP) Research: These models can serve as a foundation for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field. * Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. * Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations * Training Data * The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. * The scope of the training dataset determines the subject areas the model can handle effectively. * Context and Task Complexity * LLMs are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. * A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). * Language Ambiguity and Nuance * Natural language is inherently complex. LLMs might struggle to grasp subtle nuances, sarcasm, or figurative language. * Factual Accuracy * LLMs generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. * Common Sense * LLMs rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of large language models (LLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: * Bias and Fairness * LLMs trained on large-scale, real-world text data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, input data pre-processing described and posterior evaluations reported in this card. * Misinformation and Misuse * LLMs can be misused to generate text that is false, misleading, or harmful. * Guidelines are provided for responsible use with the model, see the [Responsible Generative AI Toolkit][rai-toolkit]. * Transparency and Accountability: * This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. * A responsibly developed open model offers the opportunity to share innovation by making LLM technology accessible to developers and researchers across the AI ecosystem. 
Risks identified and mitigations: * Perpetuation of biases: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. * Generation of harmful content: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. * Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate against malicious applications of LLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. * Privacy violations: Models were trained on data filtered for removal of PII (Personally Identifiable Information). Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. ### Benefits At the time of release, this family of models provides high-performance open large language model implementations designed from the ground up for Responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have shown to provide superior performance to other, comparably-sized open model alternatives. [tech-report]: [rai-toolkit]: [kaggle-gemma]: [terms]: [vertex-mg-gemma2]: [sensitive-info]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [sustainability]: [foundation-models]: [gemini-2-paper]: [mmlu]: [hellaswag]: [piqa]: [socialiqa]: [boolq]: [winogrande]: [commonsenseqa]: [openbookqa]: [arc]: [triviaqa]: [naturalq]: [humaneval]: [mbpp]: [gsm8k]: [realtox]: [bold]: [crows]: [bbq]: [winogender]: [truthfulqa]: [winobias]: [math]: [agieval]: [drop]: [big-bench]: [toxigen]: [eval-danger]:",
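The "Advanced Usage / Torch compile" snippet mentioned earlier in this card is likewise missing from the dump. A minimal, unofficial sketch of the pattern it describes for google/gemma-2-2b (static KV cache, compiled forward pass, two warm-up generations before full speed) might look like the following; the prompt, compile mode, and token counts are assumptions:

```python
# Unofficial sketch of the torch.compile speed-up pattern described in the card for
# google/gemma-2-2b: static KV cache + compiled forward, with two warm-up calls
# before steady-state speed is reached. Requires transformers, torch>=2.x, and a CUDA GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16).to("cuda")

model.generation_config.cache_implementation = "static"  # assumed compile-friendly cache setting
model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)

inputs = tokenizer("The theory of relativity states", return_tensors="pt").to("cuda")

for _ in range(2):                                   # warm-up passes trigger compilation
    model.generate(**inputs, max_new_tokens=32)

output = model.generate(**inputs, max_new_tokens=32)  # compiled, full-speed call
print(tokenizer.decode(output[0], skip_special_tokens=True))
```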
+ "model_explanation_gemini": "Generates English text for tasks like question answering, summarization, and reasoning, designed for resource-constrained environments like laptops or local cloud infrastructure.\n\nFeatures: \n- Text-to-text, decoder-only LLM \n- Open weights (pre-trained and instruction-tuned variants) \n- Optimized for small-scale deployment (laptops/desktops) \n- Supports 8-bit and 4-bit quantization \n- Torch compile compatibility (6x faster inference) \n\nComparison: \nLighter"
+}
\ No newline at end of file
diff --git a/model_data_json/google_gemma-2-9b-it.json b/model_data_json/google_gemma-2-9b-it.json
new file mode 100644
index 0000000000000000000000000000000000000000..a819428e61f31f831217f265eb339bd7b2387cdd
--- /dev/null
+++ b/model_data_json/google_gemma-2-9b-it.json
@@ -0,0 +1,43 @@
+{
+ "model_id": "google/gemma-2-9b-it",
+ "downloads": 336099,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "gemma2",
+ "text-generation",
+ "conversational",
+ "arxiv:2009.03300",
+ "arxiv:1905.07830",
+ "arxiv:1911.11641",
+ "arxiv:1904.09728",
+ "arxiv:1905.10044",
+ "arxiv:1907.10641",
+ "arxiv:1811.00937",
+ "arxiv:1809.02789",
+ "arxiv:1911.01547",
+ "arxiv:1705.03551",
+ "arxiv:2107.03374",
+ "arxiv:2108.07732",
+ "arxiv:2110.14168",
+ "arxiv:2009.11462",
+ "arxiv:2101.11718",
+ "arxiv:2110.08193",
+ "arxiv:1804.09301",
+ "arxiv:2109.07958",
+ "arxiv:1804.06876",
+ "arxiv:2103.03874",
+ "arxiv:2304.06364",
+ "arxiv:2206.04615",
+ "arxiv:2203.09509",
+ "base_model:google/gemma-2-9b",
+ "base_model:finetune:google/gemma-2-9b",
+ "license:gemma",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: gemma library_name: transformers pipeline_tag: text-generation extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: >- To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license tags: - conversational base_model: google/gemma-2-9b --- # Gemma 2 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma] **Terms of Use**: Terms **Authors**: Google ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights for both pre-trained variants and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Usage Below we share some code snippets on how to get quickly started with running the model. First, install the Transformers library with: Then, copy the snippet from the section that is relevant for your usecase. #### Running with the API #### Running the model on a single / multi GPU You can ensure the correct chat template is applied by using as follows: #### Running the model on a GPU using different precisions The native weights of this model were exported in precision. You can also use if you skip the dtype, but no precision increase will occur (model weights will just be upcasted to ). See examples below. * _Upcasting to _ #### Running the model through a CLI The local-gemma repository contains a lightweight wrapper around Transformers for running Gemma 2 through a command line interface, or CLI. Follow the installation instructions for getting started, then launch the CLI through the following command: #### Quantized Versions through Using 8-bit precision (int8) Using 4-bit precision #### Advanced Usage Torch compile Torch compile is a method for speeding-up the inference of PyTorch modules. The Gemma-2 model can be run up to 6x faster by leveraging torch compile. Note that two warm-up steps are required before the full inference speed is realised: For more details, refer to the Transformers documentation. ### Chat Template The instruction-tuned models use a chat template that must be adhered to for conversational use. The easiest way to apply it is using the tokenizer's built-in chat template, as shown in the following snippet. Let's load the model and apply the chat template to a conversation. In this example, we'll start with a single user interaction: At this point, the prompt contains the following text: As you can see, each turn is preceded by a delimiter and then the role of the entity (either , for content supplied by the user, or for LLM responses). Turns finish with the token. 
You can follow this format to build the prompt manually, if you need to do it without the tokenizer's chat template. After the prompt is ready, generation can be performed like this: ### Inputs and outputs * **Input:** Text string, such as a question, a prompt, or a document to be summarized. * **Output:** Generated English-language text in response to the input, such as an answer to a question, or a summary of a document. ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 13 trillion tokens and the 9B model was trained with 8 trillion tokens. Here are the key components: * Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. Primarily English-language content. * Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code or understand code-related questions. * Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. The combination of these diverse data sources is crucial for training a powerful language model that can handle a wide variety of different tasks and text formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: * CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. * Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. * Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using the latest generation of [Tensor Processing Unit (TPU)][tpu] hardware (TPUv5p). Training large language models requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: * Performance: TPUs are specifically designed to handle the massive computations involved in training LLMs. They can speed up training considerably compared to CPUs. * Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. * Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. * Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. * These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. 
JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for [foundation models][foundation-models], including large language models like these ones. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; \"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\" ## Evaluation Model evaluation metrics and results. ### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: | Benchmark | Metric | Gemma PT 9B | Gemma PT 27B | | ------------------------------ | ------------- | ----------- | ------------ | | [MMLU][mmlu] | 5-shot, top-1 | 71.3 | 75.2 | | [HellaSwag][hellaswag] | 10-shot | 81.9 | 86.4 | | [PIQA][piqa] | 0-shot | 81.7 | 83.2 | | [SocialIQA][socialiqa] | 0-shot | 53.4 | 53.7 | | [BoolQ][boolq] | 0-shot | 84.2 | 84.8 | | [WinoGrande][winogrande] | partial score | 80.6 | 83.7 | | [ARC-e][arc] | 0-shot | 88.0 | 88.6 | | [ARC-c][arc] | 25-shot | 68.4 | 71.4 | | [TriviaQA][triviaqa] | 5-shot | 76.6 | 83.7 | | [Natural Questions][naturalq] | 5-shot | 29.2 | 34.5 | | [HumanEval][humaneval] | pass@1 | 40.2 | 51.8 | | [MBPP][mbpp] | 3-shot | 52.4 | 62.6 | | [GSM8K][gsm8k] | 5-shot, maj@1 | 68.6 | 74.0 | | [MATH][math] | 4-shot | 36.6 | 42.3 | | [AGIEval][agieval] | 3-5-shot | 52.8 | 55.1 | | [BIG-Bench][big-bench] | 3-shot, CoT | 68.2 | 74.9 | | ------------------------------ | ------------- | ----------- | ------------ | ## Ethics and Safety Ethics and safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: * Text-to-Text Content Safety: Human evaluation on prompts covering safety policies including child sexual abuse and exploitation, harassment, violence and gore, and hate speech. * Text-to-Text Representational Harms: Benchmark against relevant academic datasets such as [WinoBias][winobias] and [BBQ Dataset][bbq]. * Memorization: Automated evaluation of memorization of training data, including the risk of personally identifiable information exposure. * Large-scale harm: Tests for \"dangerous capabilities,\" such as chemical, biological, radiological, and nuclear (CBRN) risks. ### Evaluation Results The results of ethics and safety evaluations are within acceptable thresholds for meeting [internal policies][safety-policies] for categories such as child safety, content safety, representational harms, memorization, large-scale harms. On top of robust internal evaluations, the results of well-known safety benchmarks like BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA are shown here. 
#### Gemma 2.0 | Benchmark | Metric | Gemma 2 IT 9B | Gemma 2 IT 27B | | ------------------------ | ------------- | --------------- | ---------------- | | [RealToxicity][realtox] | average | 8.25 | 8.84 | | [CrowS-Pairs][crows] | top-1 | 37.47 | 36.67 | | [BBQ Ambig][bbq] | 1-shot, top-1 | 88.58 | 85.99 | | [BBQ Disambig][bbq] | top-1 | 82.67 | 86.94 | | [Winogender][winogender] | top-1 | 79.17 | 77.22 | | [TruthfulQA][truthfulqa] | | 50.27 | 51.60 | | [Winobias 1_2][winobias] | | 78.09 | 81.94 | | [Winobias 2_2][winobias] | | 95.32 | 97.22 | | [Toxigen][toxigen] | | 39.30 | 38.42 | | ------------------------ | ------------- | --------------- | ---------------- | ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open Large Language Models (LLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. * Content Creation and Communication * Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. * Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. * Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. * Research and Education * Natural Language Processing (NLP) Research: These models can serve as a foundation for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field. * Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. * Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations * Training Data * The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. * The scope of the training dataset determines the subject areas the model can handle effectively. * Context and Task Complexity * LLMs are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. * A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). * Language Ambiguity and Nuance * Natural language is inherently complex. LLMs might struggle to grasp subtle nuances, sarcasm, or figurative language. * Factual Accuracy * LLMs generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. * Common Sense * LLMs rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of large language models (LLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: * Bias and Fairness * LLMs trained on large-scale, real-world text data can reflect socio-cultural biases embedded in the training material. 
These models underwent careful scrutiny, input data pre-processing described and posterior evaluations reported in this card. * Misinformation and Misuse * LLMs can be misused to generate text that is false, misleading, or harmful. * Guidelines are provided for responsible use with the model, see the [Responsible Generative AI Toolkit][rai-toolkit]. * Transparency and Accountability: * This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. * A responsibly developed open model offers the opportunity to share innovation by making LLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: * Perpetuation of biases: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. * Generation of harmful content: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. * Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate against malicious applications of LLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. * Privacy violations: Models were trained on data filtered for removal of PII (Personally Identifiable Information). Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. ### Benefits At the time of release, this family of models provides high-performance open large language model implementations designed from the ground up for Responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have shown to provide superior performance to other, comparably-sized open model alternatives. [rai-toolkit]: [kaggle-gemma]: [terms]: [vertex-mg-gemma]: [sensitive-info]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [sustainability]: [foundation-models]: [gemini-2-paper]: [mmlu]: [hellaswag]: [piqa]: [socialiqa]: [boolq]: [winogrande]: [commonsenseqa]: [openbookqa]: [arc]: [triviaqa]: [naturalq]: [humaneval]: [mbpp]: [gsm8k]: [realtox]: [bold]: [crows]: [bbq]: [winogender]: [truthfulqa]: [winobias]: [math]: [agieval]: [big-bench]: [toxigen]:",
+ "model_explanation_gemini": "A lightweight, decoder-only text-generation model designed for tasks like question answering, summarization, and reasoning, optimized for deployment in resource-limited environments.\n\n**Features:** \n- Instruction-tuned for conversational use \n- Supports English text generation \n- Compatible with single/multi-GPU setups \n- Offers quantization options (4-bit, 8-bit) \n- Includes chat template for structured dialogue \n\n**Comparison:** \nSmaller and more deployable than Gemini models but built from similar research, prioritizing accessibility"
+}
\ No newline at end of file
diff --git a/model_data_json/google_gemma-2b.json b/model_data_json/google_gemma-2b.json
new file mode 100644
index 0000000000000000000000000000000000000000..de618743a8d651de5cd26344839fb4d9de88851e
--- /dev/null
+++ b/model_data_json/google_gemma-2b.json
@@ -0,0 +1,41 @@
+{
+ "model_id": "google/gemma-2b",
+ "downloads": 397089,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "gguf",
+ "gemma",
+ "text-generation",
+ "arxiv:2312.11805",
+ "arxiv:2009.03300",
+ "arxiv:1905.07830",
+ "arxiv:1911.11641",
+ "arxiv:1904.09728",
+ "arxiv:1905.10044",
+ "arxiv:1907.10641",
+ "arxiv:1811.00937",
+ "arxiv:1809.02789",
+ "arxiv:1911.01547",
+ "arxiv:1705.03551",
+ "arxiv:2107.03374",
+ "arxiv:2108.07732",
+ "arxiv:2110.14168",
+ "arxiv:2304.06364",
+ "arxiv:2206.04615",
+ "arxiv:1804.06876",
+ "arxiv:2110.08193",
+ "arxiv:2009.11462",
+ "arxiv:2101.11718",
+ "arxiv:1804.09301",
+ "arxiv:2109.07958",
+ "arxiv:2203.09509",
+ "license:gemma",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- library_name: transformers new_version: google/gemma-2-2b license: gemma extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged-in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license --- # Gemma Model Card **Model Page**: Gemma This model card corresponds to the 2B base version of the Gemma model. You can also visit the model card of the 7B base model, 7B instruct model, and 2B instruct model. **Resources and Technical Documentation**: * Gemma Technical Report * Responsible Generative AI Toolkit * Gemma on Kaggle * Gemma on Vertex Model Garden **Terms of Use**: Terms **Authors**: Google ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Context Length Models are trained on a context length of 8192 tokens. ### Usage Below we share some code snippets on how to get quickly started with running the model. First make sure to , then copy the snippet from the section that is relevant for your usecase. #### Fine-tuning the model You can find fine-tuning scripts and notebook under the directory of []( repository. To adapt it to this model, simply change the model-id to . In that repository, we provide: * A script to perform Supervised Fine-Tuning (SFT) on UltraChat dataset using QLoRA * A script to perform SFT using FSDP on TPU devices * A notebook that you can run on a free-tier Google Colab instance to perform SFT on English quotes dataset #### Running the model on a CPU #### Running the model on a single / multi GPU #### Running the model on a GPU using different precisions * _Using _ * _Using _ #### Quantized Versions through * _Using 8-bit precision (int8)_ * _Using 4-bit precision_ #### Other optimizations * _Flash Attention 2_ First make sure to install in your environment ### Inputs and outputs * **Input:** Text string, such as a question, a prompt, or a document to be summarized. * **Output:** Generated English-language text in response to the input, such as an answer to a question, or a summary of a document. ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources, totaling 6 trillion tokens. Here are the key components: * Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. Primarily English-language content. * Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code or understand code-related questions. 
* Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. The combination of these diverse data sources is crucial for training a powerful language model that can handle a wide variety of different tasks and text formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: * CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content * Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. * Additional methods: Filtering based on content quality and safely in line with our policies. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using the latest generation of Tensor Processing Unit (TPU) hardware (TPUv5e). Training large language models requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: * Performance: TPUs are specifically designed to handle the massive computations involved in training LLMs. They can speed up training considerably compared to CPUs. * Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. * Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. * Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. * These advantages are aligned with Google's commitments to operate sustainably. ### Software Training was done using JAX and ML Pathways. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for foundation models, including large language models like these ones. Together, JAX and ML Pathways are used as described in the paper about the Gemini family of models; \"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\" ## Evaluation Model evaluation metrics and results. 
### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: | Benchmark | Metric | 2B Params | 7B Params | | ------------------------------ | ------------- | ----------- | --------- | | MMLU | 5-shot, top-1 | 42.3 | 64.3 | | HellaSwag | 0-shot |71.4 | 81.2 | | PIQA | 0-shot | 77.3 | 81.2 | | SocialIQA | 0-shot | 49.7 | 51.8 | | BooIQ | 0-shot | 69.4 | 83.2 | | WinoGrande | partial score | 65.4 | 72.3 | | CommonsenseQA | 7-shot | 65.3 | 71.3 | | OpenBookQA | | 47.8 | 52.8 | | ARC-e | | 73.2 | 81.5 | | ARC-c | | 42.1 | 53.2 | | TriviaQA | 5-shot | 53.2 | 63.4 | | Natural Questions | 5-shot | 12.5 | 23 | | HumanEval | pass@1 | 22.0 | 32.3 | | MBPP | 3-shot | 29.2 | 44.4 | | GSM8K | maj@1 | 17.7 | 46.4 | | MATH | 4-shot | 11.8 | 24.3 | | AGIEval | | 24.2 | 41.7 | | BIG-Bench | | 35.2 | 55.1 | | ------------------------------ | ------------- | ----------- | --------- | | **Average** | | **45.0** | **56.9** | ## Ethics and Safety Ethics and safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: * Text-to-Text Content Safety: Human evaluation on prompts covering safety policies including child sexual abuse and exploitation, harassment, violence and gore, and hate speech. * Text-to-Text Representational Harms: Benchmark against relevant academic datasets such as WinoBias and BBQ Dataset. * Memorization: Automated evaluation of memorization of training data, including the risk of personally identifiable information exposure. * Large-scale harm: Tests for \"dangerous capabilities,\" such as chemical, biological, radiological, and nuclear (CBRN) risks. ### Evaluation Results The results of ethics and safety evaluations are within acceptable thresholds for meeting internal policies for categories such as child safety, content safety, representational harms, memorization, large-scale harms. On top of robust internal evaluations, the results of well known safety benchmarks like BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA are shown here. **Update**: These numbers reflect the new numbers from the updated v1.1 IT models. For the original v1 numbers, please consult the technical report's appendix for the results. | Benchmark | Metric | Gemma v1.1 IT 2B | Gemma v1.1 IT 7B | | ------------------------------ | ------------- | ----------- | --------- | | RealToxicity | average | 6.86 | 7.90 | | BOLD | | 45.57 | 49.08 | | CrowS-Pairs | top-1 | 45.82 | 51.33 | | BBQ Ambig | 1-shot, top-1 | 62.58 | 92.54 | | BBQ Disambig | top-1 | 54.62 | 71.99 | | Winogender | top-1 | 51.25 | 54.17 | | TruthfulQA | | 31.81 | 44.84 | | Winobias 1_2 | | 56.12 | 59.09 | | Winobias 2_2 | | 91.10 | 92.23 | | Toxigen | | 29.77 | 39.59 | | ------------------------------ | ------------- | ----------- | --------- | ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open Large Language Models (LLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. 
The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. * Content Creation and Communication * Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. * Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. * Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. * Research and Education * Natural Language Processing (NLP) Research: These models can serve as a foundation for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field. * Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. * Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations * Training Data * The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. * The scope of the training dataset determines the subject areas the model can handle effectively. * Context and Task Complexity * LLMs are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. * A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). * Language Ambiguity and Nuance * Natural language is inherently complex. LLMs might struggle to grasp subtle nuances, sarcasm, or figurative language. * Factual Accuracy * LLMs generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. * Common Sense * LLMs rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of large language models (LLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: * Bias and Fairness * LLMs trained on large-scale, real-world text data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, input data pre-processing described and posterior evaluations reported in this card. * Misinformation and Misuse * LLMs can be misused to generate text that is false, misleading, or harmful. * Guidelines are provided for responsible use with the model, see the Responsible Generative AI Toolkit. * Transparency and Accountability: * This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. * A responsibly developed open model offers the opportunity to share innovation by making LLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: * Perpetuation of biases: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. 
* Generation of harmful content: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. * Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate malicious applications of LLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the Gemma Prohibited Use Policy. * Privacy violations: Models were trained on data filtered for removal of PII (Personally Identifiable Information). Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. ### Benefits At the time of release, this family of models provides high-performance open large language model implementations designed from the ground up for Responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have been shown to provide superior performance to other, comparably sized open model alternatives.",
+ "model_explanation_gemini": "\"Gemma-2b is a lightweight, decoder-only text generation model designed for tasks like question answering, summarization, and reasoning, optimized for deployment on resource-limited devices.\"\n\nFeatures: \n- Text-to-text generation (English) \n- Decoder-only architecture \n- 2B parameter base model \n- 8192-token context length \n- Trained on diverse datasets (web docs, code, math) \n- Supports fine-tuning and quantization \n\nComparison: \nSmaller than the"
+}
\ No newline at end of file
diff --git a/model_data_json/google_gemma-3-12b-it.json b/model_data_json/google_gemma-3-12b-it.json
new file mode 100644
index 0000000000000000000000000000000000000000..8c665274b65f928365ef061628af4247212e48a7
--- /dev/null
+++ b/model_data_json/google_gemma-3-12b-it.json
@@ -0,0 +1,47 @@
+{
+ "model_id": "google/gemma-3-12b-it",
+ "downloads": 354924,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "gemma3",
+ "image-text-to-text",
+ "conversational",
+ "arxiv:1905.07830",
+ "arxiv:1905.10044",
+ "arxiv:1911.11641",
+ "arxiv:1904.09728",
+ "arxiv:1705.03551",
+ "arxiv:1911.01547",
+ "arxiv:1907.10641",
+ "arxiv:1903.00161",
+ "arxiv:2009.03300",
+ "arxiv:2304.06364",
+ "arxiv:2103.03874",
+ "arxiv:2110.14168",
+ "arxiv:2311.12022",
+ "arxiv:2108.07732",
+ "arxiv:2107.03374",
+ "arxiv:2210.03057",
+ "arxiv:2106.03193",
+ "arxiv:1910.11856",
+ "arxiv:2502.12404",
+ "arxiv:2502.21228",
+ "arxiv:2404.16816",
+ "arxiv:2104.12756",
+ "arxiv:2311.16502",
+ "arxiv:2203.10244",
+ "arxiv:2404.12390",
+ "arxiv:1810.12440",
+ "arxiv:1908.02660",
+ "arxiv:2312.11805",
+ "base_model:google/gemma-3-12b-pt",
+ "base_model:finetune:google/gemma-3-12b-pt",
+ "license:gemma",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: gemma library_name: transformers pipeline_tag: image-text-to-text extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license base_model: google/gemma-3-12b-pt --- # Gemma 3 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Gemma 3 Technical Report][g3-tech-report] * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma3] **Terms of Use**: [Terms][terms] **Authors**: Google DeepMind ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants. Gemma 3 has a large, 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Inputs and outputs - **Input:** - Text string, such as a question, a prompt, or a document to be summarized - Images, normalized to 896 x 896 resolution and encoded to 256 tokens each - Total input context of 128K tokens for the 4B, 12B, and 27B sizes, and 32K tokens for the 1B size - **Output:** - Generated text in response to the input, such as an answer to a question, analysis of image content, or a summary of a document - Total output context of 8192 tokens ### Usage Below, there are some code snippets on how to get quickly started with running the model. First, install the Transformers library. Gemma 3 is supported starting from transformers 4.50.0. Then, copy the snippet from the section that is relevant for your use case. #### Running with the API You can initialize the model and processor for inference with as follows. With instruction-tuned models, you need to use chat templates to process our inputs first. Then, you can pass it to the pipeline. #### Running the model on a single / multi GPU ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 14 trillion tokens, the 12B model was trained with 12 trillion tokens, 4B model was trained with 4 trillion tokens and 1B with 2 trillion tokens. Here are the key components: - Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. The training dataset includes content in over 140 languages. 
- Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code and understand code-related questions. - Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. - Images: A wide range of images enables the model to perform image analysis and visual data extraction tasks. The combination of these diverse data sources is crucial for training a powerful multimodal model that can handle a wide variety of different tasks and data formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: - CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. - Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. - Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using [Tensor Processing Unit (TPU)][tpu] hardware (TPUv4p, TPUv5p and TPUv5e). Training vision-language models (VLMS) requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: - Performance: TPUs are specifically designed to handle the massive computations involved in training VLMs. They can speed up training considerably compared to CPUs. - Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. - Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. - Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. - These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for foundation models, including large language models like these ones. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; *\"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\"* ## Evaluation Model evaluation metrics and results. 
### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: #### Reasoning and factuality | Benchmark | Metric | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:--------------:|:-------------:|:--------------:|:--------------:| | [HellaSwag][hellaswag] | 10-shot | 62.3 | 77.2 | 84.2 | 85.6 | | [BoolQ][boolq] | 0-shot | 63.2 | 72.3 | 78.8 | 82.4 | | [PIQA][piqa] | 0-shot | 73.8 | 79.6 | 81.8 | 83.3 | | [SocialIQA][socialiqa] | 0-shot | 48.9 | 51.9 | 53.4 | 54.9 | | [TriviaQA][triviaqa] | 5-shot | 39.8 | 65.8 | 78.2 | 85.5 | | [Natural Questions][naturalq] | 5-shot | 9.48 | 20.0 | 31.4 | 36.1 | | [ARC-c][arc] | 25-shot | 38.4 | 56.2 | 68.9 | 70.6 | | [ARC-e][arc] | 0-shot | 73.0 | 82.4 | 88.3 | 89.0 | | [WinoGrande][winogrande] | 5-shot | 58.2 | 64.7 | 74.3 | 78.8 | | [BIG-Bench Hard][bbh] | few-shot | 28.4 | 50.9 | 72.6 | 77.7 | | [DROP][drop] | 1-shot | 42.4 | 60.1 | 72.2 | 77.2 | [hellaswag]: [boolq]: [piqa]: [socialiqa]: [triviaqa]: [naturalq]: [arc]: [winogrande]: [bbh]: [drop]: #### STEM and code | Benchmark | Metric | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:-------------:|:--------------:|:--------------:| | [MMLU][mmlu] | 5-shot | 59.6 | 74.5 | 78.6 | | [MMLU][mmlu] (Pro COT) | 5-shot | 29.2 | 45.3 | 52.2 | | [AGIEval][agieval] | 3-5-shot | 42.1 | 57.4 | 66.2 | | [MATH][math] | 4-shot | 24.2 | 43.3 | 50.0 | | [GSM8K][gsm8k] | 8-shot | 38.4 | 71.0 | 82.6 | | [GPQA][gpqa] | 5-shot | 15.0 | 25.4 | 24.3 | | [MBPP][mbpp] | 3-shot | 46.0 | 60.4 | 65.6 | | [HumanEval][humaneval] | 0-shot | 36.0 | 45.7 | 48.8 | [mmlu]: [agieval]: [math]: [gsm8k]: [gpqa]: [mbpp]: [humaneval]: #### Multilingual | Benchmark | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------------ |:-------------:|:-------------:|:--------------:|:--------------:| | [MGSM][mgsm] | 2.04 | 34.7 | 64.3 | 74.3 | | [Global-MMLU-Lite][global-mmlu-lite] | 24.9 | 57.0 | 69.4 | 75.7 | | [WMT24++][wmt24pp] (ChrF) | 36.7 | 48.4 | 53.9 | 55.7 | | [FloRes][flores] | 29.5 | 39.2 | 46.0 | 48.8 | | [XQuAD][xquad] (all) | 43.9 | 68.0 | 74.5 | 76.8 | | [ECLeKTic][eclektic] | 4.69 | 11.0 | 17.2 | 24.4 | | [IndicGenBench][indicgenbench] | 41.4 | 57.2 | 61.7 | 63.4 | [mgsm]: [flores]: [xquad]: [global-mmlu-lite]: [wmt24pp]: [eclektic]: [indicgenbench]: #### Multimodal | Benchmark | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |:-------------:|:--------------:|:--------------:| | [COCOcap][coco-cap] | 102 | 111 | 116 | | [DocVQA][docvqa] (val) | 72.8 | 82.3 | 85.6 | | [InfoVQA][info-vqa] (val) | 44.1 | 54.8 | 59.4 | | [MMMU][mmmu] (pt) | 39.2 | 50.3 | 56.1 | | [TextVQA][textvqa] (val) | 58.9 | 66.5 | 68.6 | | [RealWorldQA][realworldqa] | 45.5 | 52.2 | 53.9 | | [ReMI][remi] | 27.3 | 38.5 | 44.8 | | [AI2D][ai2d] | 63.2 | 75.2 | 79.0 | | [ChartQA][chartqa] | 63.6 | 74.7 | 76.3 | | [VQAv2][vqav2] | 63.9 | 71.2 | 72.9 | | [BLINK][blinkvqa] | 38.0 | 35.9 | 39.6 | | [OKVQA][okvqa] | 51.0 | 58.7 | 60.2 | | [TallyQA][tallyqa] | 42.5 | 51.8 | 54.3 | | [SpatialSense VQA][ss-vqa] | 50.9 | 60.0 | 59.4 | | [CountBenchQA][countbenchqa] | 26.1 | 17.8 | 68.0 | [coco-cap]: [docvqa]: [info-vqa]: [mmmu]: [textvqa]: [realworldqa]: [remi]: [ai2d]: [chartqa]: [vqav2]: [blinkvqa]: [okvqa]: [tallyqa]: [ss-vqa]: [countbenchqa]: ## Ethics and Safety Ethics and 
safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: - **Child Safety**: Evaluation of text-to-text and image to text prompts covering child safety policies, including child sexual abuse and exploitation. - **Content Safety:** Evaluation of text-to-text and image to text prompts covering safety policies including, harassment, violence and gore, and hate speech. - **Representational Harms**: Evaluation of text-to-text and image to text prompts covering safety policies including bias, stereotyping, and harmful associations or inaccuracies. In addition to development level evaluations, we conduct \"assurance evaluations\" which are our 'arms-length' internal evaluations for responsibility governance decision making. They are conducted separately from the model development team, to inform decision making about release. High level findings are fed back to the model team, but prompt sets are held-out to prevent overfitting and preserve the results' ability to inform decision making. Assurance evaluation results are reported to our Responsibility & Safety Council as part of release review. ### Evaluation Results For all areas of safety testing, we saw major improvements in the categories of child safety, content safety, and representational harms relative to previous Gemma models. All testing was conducted without safety filters to evaluate the model capabilities and behaviors. For both text-to-text and image-to-text, and across all model sizes, the model produced minimal policy violations, and showed significant improvements over previous Gemma models' performance with respect to ungrounded inferences. A limitation of our evaluations was they included only English language prompts. ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open vision-language models (VLMs) models have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. - Content Creation and Communication - Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. - Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. - Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. - Image Data Extraction: These models can be used to extract, interpret, and summarize visual data for text communications. - Research and Education - Natural Language Processing (NLP) and VLM Research: These models can serve as a foundation for researchers to experiment with VLM and NLP techniques, develop algorithms, and contribute to the advancement of the field. - Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. 
- Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations - Training Data - The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. - The scope of the training dataset determines the subject areas the model can handle effectively. - Context and Task Complexity - Models are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. - A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). - Language Ambiguity and Nuance - Natural language is inherently complex. Models might struggle to grasp subtle nuances, sarcasm, or figurative language. - Factual Accuracy - Models generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. - Common Sense - Models rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of vision-language models (VLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: - Bias and Fairness - VLMs trained on large-scale, real-world text and image data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, input data pre-processing described and posterior evaluations reported in this card. - Misinformation and Misuse - VLMs can be misused to generate text that is false, misleading, or harmful. - Guidelines are provided for responsible use with the model, see the [Responsible Generative AI Toolkit][rai-toolkit]. - Transparency and Accountability: - This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. - A responsibly developed open model offers the opportunity to share innovation by making VLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: - **Perpetuation of biases**: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. - **Generation of harmful content**: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. - **Misuse for malicious purposes**: Technical limitations and developer and end-user education can help mitigate against malicious applications of VLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. - **Privacy violations**: Models were trained on data filtered for removal of certain personal information and other sensitive data. Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. 
### Benefits At the time of release, this family of models provides high-performance open vision-language model implementations designed from the ground up for responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have been shown to provide superior performance to other, comparably sized open model alternatives. [g3-tech-report]: [rai-toolkit]: [kaggle-gemma]: [vertex-mg-gemma3]: [terms]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [gemini-2-paper]:",
+ "model_explanation_gemini": "A lightweight, multimodal model handling text and image inputs to generate text outputs, optimized for tasks like question answering, summarization, and reasoning with a 128K context window and multilingual support.\n\n**Features:** \n- Multimodal (text and image inputs, text outputs) \n- 128K token context window (12B size) \n- Supports 140+ languages \n- Instruction-tuned variants available \n- Optimized for resource-limited deployment \n\n**Comparison:** \nGemma 3 offers"
+}
\ No newline at end of file
diff --git a/model_data_json/google_gemma-3-1b-it.json b/model_data_json/google_gemma-3-1b-it.json
new file mode 100644
index 0000000000000000000000000000000000000000..30aa3cb03220fe85ad36292439302651beddb856
--- /dev/null
+++ b/model_data_json/google_gemma-3-1b-it.json
@@ -0,0 +1,47 @@
+{
+ "model_id": "google/gemma-3-1b-it",
+ "downloads": 2410194,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "gemma3_text",
+ "text-generation",
+ "conversational",
+ "arxiv:1905.07830",
+ "arxiv:1905.10044",
+ "arxiv:1911.11641",
+ "arxiv:1904.09728",
+ "arxiv:1705.03551",
+ "arxiv:1911.01547",
+ "arxiv:1907.10641",
+ "arxiv:1903.00161",
+ "arxiv:2009.03300",
+ "arxiv:2304.06364",
+ "arxiv:2103.03874",
+ "arxiv:2110.14168",
+ "arxiv:2311.12022",
+ "arxiv:2108.07732",
+ "arxiv:2107.03374",
+ "arxiv:2210.03057",
+ "arxiv:2106.03193",
+ "arxiv:1910.11856",
+ "arxiv:2502.12404",
+ "arxiv:2502.21228",
+ "arxiv:2404.16816",
+ "arxiv:2104.12756",
+ "arxiv:2311.16502",
+ "arxiv:2203.10244",
+ "arxiv:2404.12390",
+ "arxiv:1810.12440",
+ "arxiv:1908.02660",
+ "arxiv:2312.11805",
+ "base_model:google/gemma-3-1b-pt",
+ "base_model:finetune:google/gemma-3-1b-pt",
+ "license:gemma",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: gemma library_name: transformers pipeline_tag: text-generation extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license base_model: google/gemma-3-1b-pt --- # Gemma 3 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Gemma 3 Technical Report][g3-tech-report] * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma3] **Terms of Use**: [Terms][terms] **Authors**: Google DeepMind ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants. Gemma 3 has a large, 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Inputs and outputs - **Input:** - Text string, such as a question, a prompt, or a document to be summarized - Images, normalized to 896 x 896 resolution and encoded to 256 tokens each - Total input context of 128K tokens for the 4B, 12B, and 27B sizes, and 32K tokens for the 1B size - **Output:** - Generated text in response to the input, such as an answer to a question, analysis of image content, or a summary of a document - Total output context of 8192 tokens ### Usage Below, there are some code snippets on how to get quickly started with running the model. First, install the Transformers library. Gemma 3 is supported starting from transformers 4.50.0. Then, copy the snippet from the section that is relevant for your use case. #### Running with the API With instruction-tuned models, you need to use chat templates to process our inputs first. Then, you can pass it to the pipeline. #### Running the model on a single / multi GPU ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 14 trillion tokens, the 12B model was trained with 12 trillion tokens, 4B model was trained with 4 trillion tokens and 1B with 2 trillion tokens. Here are the key components: - Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. The training dataset includes content in over 140 languages. - Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code and understand code-related questions. 
- Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. - Images: A wide range of images enables the model to perform image analysis and visual data extraction tasks. The combination of these diverse data sources is crucial for training a powerful multimodal model that can handle a wide variety of different tasks and data formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: - CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. - Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. - Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using [Tensor Processing Unit (TPU)][tpu] hardware (TPUv4p, TPUv5p and TPUv5e). Training vision-language models (VLMS) requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: - Performance: TPUs are specifically designed to handle the massive computations involved in training VLMs. They can speed up training considerably compared to CPUs. - Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. - Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. - Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. - These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for foundation models, including large language models like these ones. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; *\"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\"* ## Evaluation Model evaluation metrics and results. 
### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: #### Reasoning and factuality | Benchmark | Metric | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:--------------:|:-------------:|:--------------:|:--------------:| | [HellaSwag][hellaswag] | 10-shot | 62.3 | 77.2 | 84.2 | 85.6 | | [BoolQ][boolq] | 0-shot | 63.2 | 72.3 | 78.8 | 82.4 | | [PIQA][piqa] | 0-shot | 73.8 | 79.6 | 81.8 | 83.3 | | [SocialIQA][socialiqa] | 0-shot | 48.9 | 51.9 | 53.4 | 54.9 | | [TriviaQA][triviaqa] | 5-shot | 39.8 | 65.8 | 78.2 | 85.5 | | [Natural Questions][naturalq] | 5-shot | 9.48 | 20.0 | 31.4 | 36.1 | | [ARC-c][arc] | 25-shot | 38.4 | 56.2 | 68.9 | 70.6 | | [ARC-e][arc] | 0-shot | 73.0 | 82.4 | 88.3 | 89.0 | | [WinoGrande][winogrande] | 5-shot | 58.2 | 64.7 | 74.3 | 78.8 | | [BIG-Bench Hard][bbh] | few-shot | 28.4 | 50.9 | 72.6 | 77.7 | | [DROP][drop] | 1-shot | 42.4 | 60.1 | 72.2 | 77.2 | [hellaswag]: [boolq]: [piqa]: [socialiqa]: [triviaqa]: [naturalq]: [arc]: [winogrande]: [bbh]: [drop]: #### STEM and code | Benchmark | Metric | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:-------------:|:--------------:|:--------------:| | [MMLU][mmlu] | 5-shot | 59.6 | 74.5 | 78.6 | | [MMLU][mmlu] (Pro COT) | 5-shot | 29.2 | 45.3 | 52.2 | | [AGIEval][agieval] | 3-5-shot | 42.1 | 57.4 | 66.2 | | [MATH][math] | 4-shot | 24.2 | 43.3 | 50.0 | | [GSM8K][gsm8k] | 8-shot | 38.4 | 71.0 | 82.6 | | [GPQA][gpqa] | 5-shot | 15.0 | 25.4 | 24.3 | | [MBPP][mbpp] | 3-shot | 46.0 | 60.4 | 65.6 | | [HumanEval][humaneval] | 0-shot | 36.0 | 45.7 | 48.8 | [mmlu]: [agieval]: [math]: [gsm8k]: [gpqa]: [mbpp]: [humaneval]: #### Multilingual | Benchmark | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------------ |:-------------:|:-------------:|:--------------:|:--------------:| | [MGSM][mgsm] | 2.04 | 34.7 | 64.3 | 74.3 | | [Global-MMLU-Lite][global-mmlu-lite] | 24.9 | 57.0 | 69.4 | 75.7 | | [WMT24++][wmt24pp] (ChrF) | 36.7 | 48.4 | 53.9 | 55.7 | | [FloRes][flores] | 29.5 | 39.2 | 46.0 | 48.8 | | [XQuAD][xquad] (all) | 43.9 | 68.0 | 74.5 | 76.8 | | [ECLeKTic][eclektic] | 4.69 | 11.0 | 17.2 | 24.4 | | [IndicGenBench][indicgenbench] | 41.4 | 57.2 | 61.7 | 63.4 | [mgsm]: [flores]: [xquad]: [global-mmlu-lite]: [wmt24pp]: [eclektic]: [indicgenbench]: #### Multimodal | Benchmark | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |:-------------:|:--------------:|:--------------:| | [COCOcap][coco-cap] | 102 | 111 | 116 | | [DocVQA][docvqa] (val) | 72.8 | 82.3 | 85.6 | | [InfoVQA][info-vqa] (val) | 44.1 | 54.8 | 59.4 | | [MMMU][mmmu] (pt) | 39.2 | 50.3 | 56.1 | | [TextVQA][textvqa] (val) | 58.9 | 66.5 | 68.6 | | [RealWorldQA][realworldqa] | 45.5 | 52.2 | 53.9 | | [ReMI][remi] | 27.3 | 38.5 | 44.8 | | [AI2D][ai2d] | 63.2 | 75.2 | 79.0 | | [ChartQA][chartqa] | 63.6 | 74.7 | 76.3 | | [VQAv2][vqav2] | 63.9 | 71.2 | 72.9 | | [BLINK][blinkvqa] | 38.0 | 35.9 | 39.6 | | [OKVQA][okvqa] | 51.0 | 58.7 | 60.2 | | [TallyQA][tallyqa] | 42.5 | 51.8 | 54.3 | | [SpatialSense VQA][ss-vqa] | 50.9 | 60.0 | 59.4 | | [CountBenchQA][countbenchqa] | 26.1 | 17.8 | 68.0 | [coco-cap]: [docvqa]: [info-vqa]: [mmmu]: [textvqa]: [realworldqa]: [remi]: [ai2d]: [chartqa]: [vqav2]: [blinkvqa]: [okvqa]: [tallyqa]: [ss-vqa]: [countbenchqa]: ## Ethics and Safety Ethics and 
safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: - **Child Safety**: Evaluation of text-to-text and image to text prompts covering child safety policies, including child sexual abuse and exploitation. - **Content Safety:** Evaluation of text-to-text and image to text prompts covering safety policies including, harassment, violence and gore, and hate speech. - **Representational Harms**: Evaluation of text-to-text and image to text prompts covering safety policies including bias, stereotyping, and harmful associations or inaccuracies. In addition to development level evaluations, we conduct \"assurance evaluations\" which are our 'arms-length' internal evaluations for responsibility governance decision making. They are conducted separately from the model development team, to inform decision making about release. High level findings are fed back to the model team, but prompt sets are held-out to prevent overfitting and preserve the results' ability to inform decision making. Assurance evaluation results are reported to our Responsibility & Safety Council as part of release review. ### Evaluation Results For all areas of safety testing, we saw major improvements in the categories of child safety, content safety, and representational harms relative to previous Gemma models. All testing was conducted without safety filters to evaluate the model capabilities and behaviors. For both text-to-text and image-to-text, and across all model sizes, the model produced minimal policy violations, and showed significant improvements over previous Gemma models' performance with respect to ungrounded inferences. A limitation of our evaluations was they included only English language prompts. ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open vision-language models (VLMs) models have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. - Content Creation and Communication - Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. - Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. - Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. - Image Data Extraction: These models can be used to extract, interpret, and summarize visual data for text communications. - Research and Education - Natural Language Processing (NLP) and VLM Research: These models can serve as a foundation for researchers to experiment with VLM and NLP techniques, develop algorithms, and contribute to the advancement of the field. - Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. 
- Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations - Training Data - The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. - The scope of the training dataset determines the subject areas the model can handle effectively. - Context and Task Complexity - Models are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. - A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). - Language Ambiguity and Nuance - Natural language is inherently complex. Models might struggle to grasp subtle nuances, sarcasm, or figurative language. - Factual Accuracy - Models generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. - Common Sense - Models rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of vision-language models (VLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: - Bias and Fairness - VLMs trained on large-scale, real-world text and image data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, input data pre-processing described and posterior evaluations reported in this card. - Misinformation and Misuse - VLMs can be misused to generate text that is false, misleading, or harmful. - Guidelines are provided for responsible use with the model, see the [Responsible Generative AI Toolkit][rai-toolkit]. - Transparency and Accountability: - This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. - A responsibly developed open model offers the opportunity to share innovation by making VLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: - **Perpetuation of biases**: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. - **Generation of harmful content**: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. - **Misuse for malicious purposes**: Technical limitations and developer and end-user education can help mitigate against malicious applications of VLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. - **Privacy violations**: Models were trained on data filtered for removal of certain personal information and other sensitive data. Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. 
### Benefits At the time of release, this family of models provides high-performance open vision-language model implementations designed from the ground up for responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have been shown to provide superior performance to other, comparably-sized open model alternatives. [g3-tech-report]: [rai-toolkit]: [kaggle-gemma]: [vertex-mg-gemma3]: [terms]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [sustainability]: [gemini-2-paper]:",
+ "model_explanation_gemini": "\"google_gemma-3-1b-it\" is a lightweight, multimodal AI model from Google's Gemma 3 family, designed for text generation and image understanding tasks like question answering and summarization, with support for 140+ languages and a 128K token context window.\n\nFeatures: \n- Multimodal (text and image inputs, text output) \n- 128K token context window \n- Supports 140+ languages \n- Instruction-tuned variant \n- Optimized"
+}
\ No newline at end of file
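
The flattened description above references Usage snippets for the instruction-tuned checkpoint (chat templates processed before passing input to the pipeline) that are not included in the JSON. A minimal sketch of that flow, assuming `transformers` >= 4.50.0, `accelerate`, and enough memory for the 1B model; the prompt text is illustrative only:

```python
import torch
from transformers import pipeline

# Chat-style messages; the text-generation pipeline applies the model's chat template internally.
pipe = pipeline(
    "text-generation",
    model="google/gemma-3-1b-it",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
messages = [
    {"role": "user", "content": "Summarize what a 128K context window is useful for."},
]
result = pipe(messages, max_new_tokens=64)

# For chat input, generated_text holds the whole conversation; the last turn is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```
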
diff --git a/model_data_json/google_gemma-3-1b-pt.json b/model_data_json/google_gemma-3-1b-pt.json
new file mode 100644
index 0000000000000000000000000000000000000000..ac5b083abe31d784c91027fbc3181f725bd0a8f0
--- /dev/null
+++ b/model_data_json/google_gemma-3-1b-pt.json
@@ -0,0 +1,44 @@
+{
+ "model_id": "google/gemma-3-1b-pt",
+ "downloads": 154238,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "gemma3_text",
+ "text-generation",
+ "arxiv:1905.07830",
+ "arxiv:1905.10044",
+ "arxiv:1911.11641",
+ "arxiv:1904.09728",
+ "arxiv:1705.03551",
+ "arxiv:1911.01547",
+ "arxiv:1907.10641",
+ "arxiv:1903.00161",
+ "arxiv:2009.03300",
+ "arxiv:2304.06364",
+ "arxiv:2103.03874",
+ "arxiv:2110.14168",
+ "arxiv:2311.12022",
+ "arxiv:2108.07732",
+ "arxiv:2107.03374",
+ "arxiv:2210.03057",
+ "arxiv:2106.03193",
+ "arxiv:1910.11856",
+ "arxiv:2502.12404",
+ "arxiv:2502.21228",
+ "arxiv:2404.16816",
+ "arxiv:2104.12756",
+ "arxiv:2311.16502",
+ "arxiv:2203.10244",
+ "arxiv:2404.12390",
+ "arxiv:1810.12440",
+ "arxiv:1908.02660",
+ "arxiv:2312.11805",
+ "license:gemma",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: gemma library_name: transformers pipeline_tag: text-generation extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: >- To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license --- # Gemma 3 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Gemma 3 Technical Report][g3-tech-report] * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma3] **Terms of Use**: [Terms][terms] **Authors**: Google DeepMind ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants. Gemma 3 has a large, 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Usage Below, there are some code snippets on how to get quickly started with running the model. First, install the Transformers library. Gemma 3 is supported starting from transformers 4.50.0. Then, copy the snippet from the section that is relevant for your use case. #### Running with the API #### Running the model on a single / multi GPU ### Inputs and outputs - **Input:** - Text string, such as a question, a prompt, or a document to be summarized - Images, normalized to 896 x 896 resolution and encoded to 256 tokens each - Total input context of 128K tokens for the 4B, 12B, and 27B sizes, and 32K tokens for the 1B size - **Output:** - Generated text in response to the input, such as an answer to a question, analysis of image content, or a summary of a document - Total output context of 8192 tokens ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 14 trillion tokens, the 12B model was trained with 12 trillion tokens, 4B model was trained with 4 trillion tokens and 1B with 2 trillion tokens. Here are the key components: - Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. The training dataset includes content in over 140 languages. - Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code and understand code-related questions. - Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. 
- Images: A wide range of images enables the model to perform image analysis and visual data extraction tasks. The combination of these diverse data sources is crucial for training a powerful multimodal model that can handle a wide variety of different tasks and data formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: - CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. - Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. - Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using [Tensor Processing Unit (TPU)][tpu] hardware (TPUv4p, TPUv5p and TPUv5e). Training vision-language models (VLMS) requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: - Performance: TPUs are specifically designed to handle the massive computations involved in training VLMs. They can speed up training considerably compared to CPUs. - Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. - Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. - Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. - These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for foundation models, including large language models like these ones. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; *\"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\"* ## Evaluation Model evaluation metrics and results. 
### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: #### Reasoning and factuality | Benchmark | Metric | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:--------------:|:-------------:|:--------------:|:--------------:| | [HellaSwag][hellaswag] | 10-shot | 62.3 | 77.2 | 84.2 | 85.6 | | [BoolQ][boolq] | 0-shot | 63.2 | 72.3 | 78.8 | 82.4 | | [PIQA][piqa] | 0-shot | 73.8 | 79.6 | 81.8 | 83.3 | | [SocialIQA][socialiqa] | 0-shot | 48.9 | 51.9 | 53.4 | 54.9 | | [TriviaQA][triviaqa] | 5-shot | 39.8 | 65.8 | 78.2 | 85.5 | | [Natural Questions][naturalq] | 5-shot | 9.48 | 20.0 | 31.4 | 36.1 | | [ARC-c][arc] | 25-shot | 38.4 | 56.2 | 68.9 | 70.6 | | [ARC-e][arc] | 0-shot | 73.0 | 82.4 | 88.3 | 89.0 | | [WinoGrande][winogrande] | 5-shot | 58.2 | 64.7 | 74.3 | 78.8 | | [BIG-Bench Hard][bbh] | few-shot | 28.4 | 50.9 | 72.6 | 77.7 | | [DROP][drop] | 1-shot | 42.4 | 60.1 | 72.2 | 77.2 | [hellaswag]: [boolq]: [piqa]: [socialiqa]: [triviaqa]: [naturalq]: [arc]: [winogrande]: [bbh]: [drop]: #### STEM and code | Benchmark | Metric | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:-------------:|:--------------:|:--------------:| | [MMLU][mmlu] | 5-shot | 59.6 | 74.5 | 78.6 | | [MMLU][mmlu] (Pro COT) | 5-shot | 29.2 | 45.3 | 52.2 | | [AGIEval][agieval] | 3-5-shot | 42.1 | 57.4 | 66.2 | | [MATH][math] | 4-shot | 24.2 | 43.3 | 50.0 | | [GSM8K][gsm8k] | 8-shot | 38.4 | 71.0 | 82.6 | | [GPQA][gpqa] | 5-shot | 15.0 | 25.4 | 24.3 | | [MBPP][mbpp] | 3-shot | 46.0 | 60.4 | 65.6 | | [HumanEval][humaneval] | 0-shot | 36.0 | 45.7 | 48.8 | [mmlu]: [agieval]: [math]: [gsm8k]: [gpqa]: [mbpp]: [humaneval]: #### Multilingual | Benchmark | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------------ |:-------------:|:-------------:|:--------------:|:--------------:| | [MGSM][mgsm] | 2.04 | 34.7 | 64.3 | 74.3 | | [Global-MMLU-Lite][global-mmlu-lite] | 24.9 | 57.0 | 69.4 | 75.7 | | [WMT24++][wmt24pp] (ChrF) | 36.7 | 48.4 | 53.9 | 55.7 | | [FloRes][flores] | 29.5 | 39.2 | 46.0 | 48.8 | | [XQuAD][xquad] (all) | 43.9 | 68.0 | 74.5 | 76.8 | | [ECLeKTic][eclektic] | 4.69 | 11.0 | 17.2 | 24.4 | | [IndicGenBench][indicgenbench] | 41.4 | 57.2 | 61.7 | 63.4 | [mgsm]: [flores]: [xquad]: [global-mmlu-lite]: [wmt24pp]: [eclektic]: [indicgenbench]: #### Multimodal | Benchmark | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |:-------------:|:--------------:|:--------------:| | [COCOcap][coco-cap] | 102 | 111 | 116 | | [DocVQA][docvqa] (val) | 72.8 | 82.3 | 85.6 | | [InfoVQA][info-vqa] (val) | 44.1 | 54.8 | 59.4 | | [MMMU][mmmu] (pt) | 39.2 | 50.3 | 56.1 | | [TextVQA][textvqa] (val) | 58.9 | 66.5 | 68.6 | | [RealWorldQA][realworldqa] | 45.5 | 52.2 | 53.9 | | [ReMI][remi] | 27.3 | 38.5 | 44.8 | | [AI2D][ai2d] | 63.2 | 75.2 | 79.0 | | [ChartQA][chartqa] | 63.6 | 74.7 | 76.3 | | [VQAv2][vqav2] | 63.9 | 71.2 | 72.9 | | [BLINK][blinkvqa] | 38.0 | 35.9 | 39.6 | | [OKVQA][okvqa] | 51.0 | 58.7 | 60.2 | | [TallyQA][tallyqa] | 42.5 | 51.8 | 54.3 | | [SpatialSense VQA][ss-vqa] | 50.9 | 60.0 | 59.4 | | [CountBenchQA][countbenchqa] | 26.1 | 17.8 | 68.0 | [coco-cap]: [docvqa]: [info-vqa]: [mmmu]: [textvqa]: [realworldqa]: [remi]: [ai2d]: [chartqa]: [vqav2]: [blinkvqa]: [okvqa]: [tallyqa]: [ss-vqa]: [countbenchqa]: ## Ethics and Safety Ethics and 
safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: - **Child Safety**: Evaluation of text-to-text and image to text prompts covering child safety policies, including child sexual abuse and exploitation. - **Content Safety:** Evaluation of text-to-text and image to text prompts covering safety policies including, harassment, violence and gore, and hate speech. - **Representational Harms**: Evaluation of text-to-text and image to text prompts covering safety policies including bias, stereotyping, and harmful associations or inaccuracies. In addition to development level evaluations, we conduct \"assurance evaluations\" which are our 'arms-length' internal evaluations for responsibility governance decision making. They are conducted separately from the model development team, to inform decision making about release. High level findings are fed back to the model team, but prompt sets are held-out to prevent overfitting and preserve the results' ability to inform decision making. Assurance evaluation results are reported to our Responsibility & Safety Council as part of release review. ### Evaluation Results For all areas of safety testing, we saw major improvements in the categories of child safety, content safety, and representational harms relative to previous Gemma models. All testing was conducted without safety filters to evaluate the model capabilities and behaviors. For both text-to-text and image-to-text, and across all model sizes, the model produced minimal policy violations, and showed significant improvements over previous Gemma models' performance with respect to ungrounded inferences. A limitation of our evaluations was they included only English language prompts. ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open vision-language models (VLMs) models have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. - Content Creation and Communication - Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. - Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. - Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. - Image Data Extraction: These models can be used to extract, interpret, and summarize visual data for text communications. - Research and Education - Natural Language Processing (NLP) and VLM Research: These models can serve as a foundation for researchers to experiment with VLM and NLP techniques, develop algorithms, and contribute to the advancement of the field. - Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. 
- Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations - Training Data - The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. - The scope of the training dataset determines the subject areas the model can handle effectively. - Context and Task Complexity - Models are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. - A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). - Language Ambiguity and Nuance - Natural language is inherently complex. Models might struggle to grasp subtle nuances, sarcasm, or figurative language. - Factual Accuracy - Models generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. - Common Sense - Models rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of vision-language models (VLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: - Bias and Fairness - VLMs trained on large-scale, real-world text and image data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, input data pre-processing described and posterior evaluations reported in this card. - Misinformation and Misuse - VLMs can be misused to generate text that is false, misleading, or harmful. - Guidelines are provided for responsible use with the model, see the [Responsible Generative AI Toolkit][rai-toolkit]. - Transparency and Accountability: - This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. - A responsibly developed open model offers the opportunity to share innovation by making VLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: - **Perpetuation of biases**: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. - **Generation of harmful content**: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. - **Misuse for malicious purposes**: Technical limitations and developer and end-user education can help mitigate against malicious applications of VLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. - **Privacy violations**: Models were trained on data filtered for removal of certain personal information and other sensitive data. Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. 
### Benefits At the time of release, this family of models provides high-performance open vision-language model implementations designed from the ground up for responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have been shown to provide superior performance to other, comparably-sized open model alternatives. [g3-tech-report]: [rai-toolkit]: [kaggle-gemma]: [vertex-mg-gemma3]: [terms]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [sustainability]: [gemini-2-paper]:",
+ "model_explanation_gemini": "Generates text and analyzes images with a 128K context window, supporting multilingual tasks like question answering and summarization in resource-limited environments.\n\nModel Features: \n- Multimodal (text and image input, text output) \n- 128K context window \n- Supports over 140 languages \n- Available in multiple sizes (1B, 4B, 12B, 27B parameters) \n- Optimized for deployment on limited-resource devices \n\nComparison: \nLarger than"
+}
\ No newline at end of file
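
The gemma-3-1b-pt description likewise refers to running the pre-trained model on a single or multi GPU without retaining the snippet. A plain-completion sketch under the same assumptions (`transformers` >= 4.50.0, `accelerate` installed for `device_map="auto"`); the prompt is a placeholder:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-3-1b-pt"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # spreads layers across available GPUs, or falls back to CPU
)

# Pre-trained (non-instruction-tuned) checkpoint: plain text completion, no chat template.
inputs = tokenizer("The key advantages of small language models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
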
diff --git a/model_data_json/google_gemma-3-27b-it-qat-q4_0-gguf.json b/model_data_json/google_gemma-3-27b-it-qat-q4_0-gguf.json
new file mode 100644
index 0000000000000000000000000000000000000000..4e9eb348c2609be4dd814b708796bf3128df1a68
--- /dev/null
+++ b/model_data_json/google_gemma-3-27b-it-qat-q4_0-gguf.json
@@ -0,0 +1,46 @@
+{
+ "model_id": "google/gemma-3-27b-it-qat-q4_0-gguf",
+ "downloads": 69446,
+ "tags": [
+ "gguf",
+ "gemma",
+ "gemma3",
+ "image-text-to-text",
+ "arxiv:1905.07830",
+ "arxiv:1905.10044",
+ "arxiv:1911.11641",
+ "arxiv:1904.09728",
+ "arxiv:1705.03551",
+ "arxiv:1911.01547",
+ "arxiv:1907.10641",
+ "arxiv:1903.00161",
+ "arxiv:2009.03300",
+ "arxiv:2304.06364",
+ "arxiv:2103.03874",
+ "arxiv:2110.14168",
+ "arxiv:2311.12022",
+ "arxiv:2108.07732",
+ "arxiv:2107.03374",
+ "arxiv:2210.03057",
+ "arxiv:2106.03193",
+ "arxiv:1910.11856",
+ "arxiv:2502.12404",
+ "arxiv:2502.21228",
+ "arxiv:2404.16816",
+ "arxiv:2104.12756",
+ "arxiv:2311.16502",
+ "arxiv:2203.10244",
+ "arxiv:2404.12390",
+ "arxiv:1810.12440",
+ "arxiv:1908.02660",
+ "arxiv:2312.11805",
+ "base_model:google/gemma-3-27b-it",
+ "base_model:quantized:google/gemma-3-27b-it",
+ "license:gemma",
+ "endpoints_compatible",
+ "region:us",
+ "conversational"
+ ],
+ "description": "--- license: gemma pipeline_tag: image-text-to-text extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: >- To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license base_model: google/gemma-3-27b-it tags: - gemma - gemma3 --- # Gemma 3 model card **Model Page**: Gemma > [!Note] > This repository corresponds to the 27 **instruction-tuned** version of the Gemma 3 model in GGUF format using Quantization Aware Training (QAT). > The GGUF corresponds to Q4_0 quantization. > > Thanks to QAT, the model is able to preserve similar quality as while significantly reducing the memory requirements > to load the model. > > You can find the half-precision version here. **Resources and Technical Documentation**: * [Gemma 3 Technical Report][g3-tech-report] * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma3] **Terms of Use**: [Terms][terms] **Authors**: Google DeepMind ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants. Gemma 3 has a large, 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Inputs and outputs - **Input:** - Text string, such as a question, a prompt, or a document to be summarized - Images, normalized to 896 x 896 resolution and encoded to 256 tokens each - Total input context of 128K tokens for the 4B, 12B, and 27B sizes, and 32K tokens for the 1B size - **Output:** - Generated text in response to the input, such as an answer to a question, analysis of image content, or a summary of a document - Total output context of 8192 tokens ### Usage Below, there are some code snippets on how to get quickly started with running the model. **llama.cpp (text-only)** **llama.cpp (image input)** **ollama (text only)** Using GGUFs with Ollama via Hugging Face does not support image inputs at the moment. Please check the docs on running gated repositories. ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 14 trillion tokens, the 12B model was trained with 12 trillion tokens, 4B model was trained with 4 trillion tokens and 1B with 2 trillion tokens. Here are the key components: - Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. The training dataset includes content in over 140 languages. 
- Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code and understand code-related questions. - Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. - Images: A wide range of images enables the model to perform image analysis and visual data extraction tasks. The combination of these diverse data sources is crucial for training a powerful multimodal model that can handle a wide variety of different tasks and data formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: - CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. - Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. - Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using [Tensor Processing Unit (TPU)][tpu] hardware (TPUv4p, TPUv5p and TPUv5e). Training vision-language models (VLMS) requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: - Performance: TPUs are specifically designed to handle the massive computations involved in training VLMs. They can speed up training considerably compared to CPUs. - Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. - Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. - Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. - These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for foundation models, including large language models like these ones. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; *\"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\"* ## Evaluation Model evaluation metrics and results. 
### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: #### Reasoning and factuality | Benchmark | Metric | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:--------------:|:-------------:|:--------------:|:--------------:| | [HellaSwag][hellaswag] | 10-shot | 62.3 | 77.2 | 84.2 | 85.6 | | [BoolQ][boolq] | 0-shot | 63.2 | 72.3 | 78.8 | 82.4 | | [PIQA][piqa] | 0-shot | 73.8 | 79.6 | 81.8 | 83.3 | | [SocialIQA][socialiqa] | 0-shot | 48.9 | 51.9 | 53.4 | 54.9 | | [TriviaQA][triviaqa] | 5-shot | 39.8 | 65.8 | 78.2 | 85.5 | | [Natural Questions][naturalq] | 5-shot | 9.48 | 20.0 | 31.4 | 36.1 | | [ARC-c][arc] | 25-shot | 38.4 | 56.2 | 68.9 | 70.6 | | [ARC-e][arc] | 0-shot | 73.0 | 82.4 | 88.3 | 89.0 | | [WinoGrande][winogrande] | 5-shot | 58.2 | 64.7 | 74.3 | 78.8 | | [BIG-Bench Hard][bbh] | few-shot | 28.4 | 50.9 | 72.6 | 77.7 | | [DROP][drop] | 1-shot | 42.4 | 60.1 | 72.2 | 77.2 | [hellaswag]: [boolq]: [piqa]: [socialiqa]: [triviaqa]: [naturalq]: [arc]: [winogrande]: [bbh]: [drop]: #### STEM and code | Benchmark | Metric | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:-------------:|:--------------:|:--------------:| | [MMLU][mmlu] | 5-shot | 59.6 | 74.5 | 78.6 | | [MMLU][mmlu] (Pro COT) | 5-shot | 29.2 | 45.3 | 52.2 | | [AGIEval][agieval] | 3-5-shot | 42.1 | 57.4 | 66.2 | | [MATH][math] | 4-shot | 24.2 | 43.3 | 50.0 | | [GSM8K][gsm8k] | 8-shot | 38.4 | 71.0 | 82.6 | | [GPQA][gpqa] | 5-shot | 15.0 | 25.4 | 24.3 | | [MBPP][mbpp] | 3-shot | 46.0 | 60.4 | 65.6 | | [HumanEval][humaneval] | 0-shot | 36.0 | 45.7 | 48.8 | [mmlu]: [agieval]: [math]: [gsm8k]: [gpqa]: [mbpp]: [humaneval]: #### Multilingual | Benchmark | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------------ |:-------------:|:-------------:|:--------------:|:--------------:| | [MGSM][mgsm] | 2.04 | 34.7 | 64.3 | 74.3 | | [Global-MMLU-Lite][global-mmlu-lite] | 24.9 | 57.0 | 69.4 | 75.7 | | [WMT24++][wmt24pp] (ChrF) | 36.7 | 48.4 | 53.9 | 55.7 | | [FloRes][flores] | 29.5 | 39.2 | 46.0 | 48.8 | | [XQuAD][xquad] (all) | 43.9 | 68.0 | 74.5 | 76.8 | | [ECLeKTic][eclektic] | 4.69 | 11.0 | 17.2 | 24.4 | | [IndicGenBench][indicgenbench] | 41.4 | 57.2 | 61.7 | 63.4 | [mgsm]: [flores]: [xquad]: [global-mmlu-lite]: [wmt24pp]: [eclektic]: [indicgenbench]: #### Multimodal | Benchmark | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |:-------------:|:--------------:|:--------------:| | [COCOcap][coco-cap] | 102 | 111 | 116 | | [DocVQA][docvqa] (val) | 72.8 | 82.3 | 85.6 | | [InfoVQA][info-vqa] (val) | 44.1 | 54.8 | 59.4 | | [MMMU][mmmu] (pt) | 39.2 | 50.3 | 56.1 | | [TextVQA][textvqa] (val) | 58.9 | 66.5 | 68.6 | | [RealWorldQA][realworldqa] | 45.5 | 52.2 | 53.9 | | [ReMI][remi] | 27.3 | 38.5 | 44.8 | | [AI2D][ai2d] | 63.2 | 75.2 | 79.0 | | [ChartQA][chartqa] | 63.6 | 74.7 | 76.3 | | [VQAv2][vqav2] | 63.9 | 71.2 | 72.9 | | [BLINK][blinkvqa] | 38.0 | 35.9 | 39.6 | | [OKVQA][okvqa] | 51.0 | 58.7 | 60.2 | | [TallyQA][tallyqa] | 42.5 | 51.8 | 54.3 | | [SpatialSense VQA][ss-vqa] | 50.9 | 60.0 | 59.4 | | [CountBenchQA][countbenchqa] | 26.1 | 17.8 | 68.0 | [coco-cap]: [docvqa]: [info-vqa]: [mmmu]: [textvqa]: [realworldqa]: [remi]: [ai2d]: [chartqa]: [vqav2]: [blinkvqa]: [okvqa]: [tallyqa]: [ss-vqa]: [countbenchqa]: ## Ethics and Safety Ethics and 
safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: - **Child Safety**: Evaluation of text-to-text and image to text prompts covering child safety policies, including child sexual abuse and exploitation. - **Content Safety:** Evaluation of text-to-text and image to text prompts covering safety policies including, harassment, violence and gore, and hate speech. - **Representational Harms**: Evaluation of text-to-text and image to text prompts covering safety policies including bias, stereotyping, and harmful associations or inaccuracies. In addition to development level evaluations, we conduct \"assurance evaluations\" which are our 'arms-length' internal evaluations for responsibility governance decision making. They are conducted separately from the model development team, to inform decision making about release. High level findings are fed back to the model team, but prompt sets are held-out to prevent overfitting and preserve the results' ability to inform decision making. Assurance evaluation results are reported to our Responsibility & Safety Council as part of release review. ### Evaluation Results For all areas of safety testing, we saw major improvements in the categories of child safety, content safety, and representational harms relative to previous Gemma models. All testing was conducted without safety filters to evaluate the model capabilities and behaviors. For both text-to-text and image-to-text, and across all model sizes, the model produced minimal policy violations, and showed significant improvements over previous Gemma models' performance with respect to ungrounded inferences. A limitation of our evaluations was they included only English language prompts. ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open vision-language models (VLMs) models have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. - Content Creation and Communication - Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. - Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. - Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. - Image Data Extraction: These models can be used to extract, interpret, and summarize visual data for text communications. - Research and Education - Natural Language Processing (NLP) and VLM Research: These models can serve as a foundation for researchers to experiment with VLM and NLP techniques, develop algorithms, and contribute to the advancement of the field. - Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. 
- Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations - Training Data - The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. - The scope of the training dataset determines the subject areas the model can handle effectively. - Context and Task Complexity - Models are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. - A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). - Language Ambiguity and Nuance - Natural language is inherently complex. Models might struggle to grasp subtle nuances, sarcasm, or figurative language. - Factual Accuracy - Models generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. - Common Sense - Models rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of vision-language models (VLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: - Bias and Fairness - VLMs trained on large-scale, real-world text and image data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, input data pre-processing described and posterior evaluations reported in this card. - Misinformation and Misuse - VLMs can be misused to generate text that is false, misleading, or harmful. - Guidelines are provided for responsible use with the model, see the [Responsible Generative AI Toolkit][rai-toolkit]. - Transparency and Accountability: - This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. - A responsibly developed open model offers the opportunity to share innovation by making VLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: - **Perpetuation of biases**: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. - **Generation of harmful content**: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. - **Misuse for malicious purposes**: Technical limitations and developer and end-user education can help mitigate against malicious applications of VLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. - **Privacy violations**: Models were trained on data filtered for removal of certain personal information and other sensitive data. Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. 
### Benefits At the time of release, this family of models provides high-performance open vision-language model implementations designed from the ground up for responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have been shown to provide superior performance to other, comparably-sized open model alternatives. [g3-tech-report]: [rai-toolkit]: [kaggle-gemma]: [vertex-mg-gemma3]: [terms]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [sustainability]: [gemini-2-paper]:",
+ "model_explanation_gemini": "A 27B-parameter instruction-tuned multimodal Gemma 3 model quantized to 4-bit (GGUF Q4_0) for efficient deployment, handling text/image inputs (128K context) to generate text outputs for tasks like QA, summarization, and image understanding.\n\n**Features**: \n- **Multimodal**: Processes text and images (896x896 resolution). \n- **Quantized**: 4-bit GGUF format via QAT reduces memory usage while preserving quality."
+}
\ No newline at end of file
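
The QAT GGUF card lists llama.cpp and Ollama invocations (text-only per the card) that were dropped from the flattened description. One way to exercise the Q4_0 file locally is through the `llama-cpp-python` bindings, sketched below on the assumption that the installed llama.cpp build supports the Gemma 3 architecture and that the GGUF has already been downloaded; the filename is hypothetical:

```python
from llama_cpp import Llama

# Hypothetical local filename for the downloaded Q4_0 quantized weights.
llm = Llama(model_path="gemma-3-27b-it-q4_0.gguf", n_ctx=8192)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "In two sentences, what does Q4_0 quantization trade off?"}],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])
```
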
diff --git a/model_data_json/google_gemma-3-27b-it.json b/model_data_json/google_gemma-3-27b-it.json
new file mode 100644
index 0000000000000000000000000000000000000000..bd344d189a49bee952b64237d8e519e541418cf7
--- /dev/null
+++ b/model_data_json/google_gemma-3-27b-it.json
@@ -0,0 +1,47 @@
+{
+ "model_id": "google/gemma-3-27b-it",
+ "downloads": 395187,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "gemma3",
+ "image-text-to-text",
+ "conversational",
+ "arxiv:1905.07830",
+ "arxiv:1905.10044",
+ "arxiv:1911.11641",
+ "arxiv:1904.09728",
+ "arxiv:1705.03551",
+ "arxiv:1911.01547",
+ "arxiv:1907.10641",
+ "arxiv:1903.00161",
+ "arxiv:2009.03300",
+ "arxiv:2304.06364",
+ "arxiv:2103.03874",
+ "arxiv:2110.14168",
+ "arxiv:2311.12022",
+ "arxiv:2108.07732",
+ "arxiv:2107.03374",
+ "arxiv:2210.03057",
+ "arxiv:2106.03193",
+ "arxiv:1910.11856",
+ "arxiv:2502.12404",
+ "arxiv:2502.21228",
+ "arxiv:2404.16816",
+ "arxiv:2104.12756",
+ "arxiv:2311.16502",
+ "arxiv:2203.10244",
+ "arxiv:2404.12390",
+ "arxiv:1810.12440",
+ "arxiv:1908.02660",
+ "arxiv:2312.11805",
+ "base_model:google/gemma-3-27b-pt",
+ "base_model:finetune:google/gemma-3-27b-pt",
+ "license:gemma",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: gemma library_name: transformers pipeline_tag: image-text-to-text extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license base_model: google/gemma-3-27b-pt --- # Gemma 3 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Gemma 3 Technical Report][g3-tech-report] * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma3] **Terms of Use**: [Terms][terms] **Authors**: Google DeepMind ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants. Gemma 3 has a large, 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Inputs and outputs - **Input:** - Text string, such as a question, a prompt, or a document to be summarized - Images, normalized to 896 x 896 resolution and encoded to 256 tokens each - Total input context of 128K tokens for the 4B, 12B, and 27B sizes, and 32K tokens for the 1B size - **Output:** - Generated text in response to the input, such as an answer to a question, analysis of image content, or a summary of a document - Total output context of 8192 tokens ### Usage Below there are some code snippets on how to get quickly started with running the model. First, install the Transformers library. Gemma 3 is supported starting from transformers 4.50.0. Then, copy the snippet from the section that is relevant for your use case. #### Running with the API You can initialize the model and processor for inference with as follows. With instruction-tuned models, you need to use chat templates to process our inputs first. Then, you can pass it to the pipeline. #### Running the model on a single/multi GPU ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 14 trillion tokens, the 12B model was trained with 12 trillion tokens, 4B model was trained with 4 trillion tokens and 1B with 2 trillion tokens. Here are the key components: - Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. The training dataset includes content in over 140 languages. 
- Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code and understand code-related questions. - Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. - Images: A wide range of images enables the model to perform image analysis and visual data extraction tasks. The combination of these diverse data sources is crucial for training a powerful multimodal model that can handle a wide variety of different tasks and data formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: - CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. - Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. - Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using [Tensor Processing Unit (TPU)][tpu] hardware (TPUv4p, TPUv5p and TPUv5e). Training vision-language models (VLMS) requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: - Performance: TPUs are specifically designed to handle the massive computations involved in training VLMs. They can speed up training considerably compared to CPUs. - Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. - Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. - Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. - These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for foundation models, including large language models like these ones. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; *\"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\"* ## Evaluation Model evaluation metrics and results. 
### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: #### Reasoning and factuality | Benchmark | Metric | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:--------------:|:-------------:|:--------------:|:--------------:| | [HellaSwag][hellaswag] | 10-shot | 62.3 | 77.2 | 84.2 | 85.6 | | [BoolQ][boolq] | 0-shot | 63.2 | 72.3 | 78.8 | 82.4 | | [PIQA][piqa] | 0-shot | 73.8 | 79.6 | 81.8 | 83.3 | | [SocialIQA][socialiqa] | 0-shot | 48.9 | 51.9 | 53.4 | 54.9 | | [TriviaQA][triviaqa] | 5-shot | 39.8 | 65.8 | 78.2 | 85.5 | | [Natural Questions][naturalq] | 5-shot | 9.48 | 20.0 | 31.4 | 36.1 | | [ARC-c][arc] | 25-shot | 38.4 | 56.2 | 68.9 | 70.6 | | [ARC-e][arc] | 0-shot | 73.0 | 82.4 | 88.3 | 89.0 | | [WinoGrande][winogrande] | 5-shot | 58.2 | 64.7 | 74.3 | 78.8 | | [BIG-Bench Hard][bbh] | few-shot | 28.4 | 50.9 | 72.6 | 77.7 | | [DROP][drop] | 1-shot | 42.4 | 60.1 | 72.2 | 77.2 | [hellaswag]: [boolq]: [piqa]: [socialiqa]: [triviaqa]: [naturalq]: [arc]: [winogrande]: [bbh]: [drop]: #### STEM and code | Benchmark | Metric | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:-------------:|:--------------:|:--------------:| | [MMLU][mmlu] | 5-shot | 59.6 | 74.5 | 78.6 | | [MMLU][mmlu] (Pro COT) | 5-shot | 29.2 | 45.3 | 52.2 | | [AGIEval][agieval] | 3-5-shot | 42.1 | 57.4 | 66.2 | | [MATH][math] | 4-shot | 24.2 | 43.3 | 50.0 | | [GSM8K][gsm8k] | 8-shot | 38.4 | 71.0 | 82.6 | | [GPQA][gpqa] | 5-shot | 15.0 | 25.4 | 24.3 | | [MBPP][mbpp] | 3-shot | 46.0 | 60.4 | 65.6 | | [HumanEval][humaneval] | 0-shot | 36.0 | 45.7 | 48.8 | [mmlu]: [agieval]: [math]: [gsm8k]: [gpqa]: [mbpp]: [humaneval]: #### Multilingual | Benchmark | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------------ |:-------------:|:-------------:|:--------------:|:--------------:| | [MGSM][mgsm] | 2.04 | 34.7 | 64.3 | 74.3 | | [Global-MMLU-Lite][global-mmlu-lite] | 24.9 | 57.0 | 69.4 | 75.7 | | [WMT24++][wmt24pp] (ChrF) | 36.7 | 48.4 | 53.9 | 55.7 | | [FloRes][flores] | 29.5 | 39.2 | 46.0 | 48.8 | | [XQuAD][xquad] (all) | 43.9 | 68.0 | 74.5 | 76.8 | | [ECLeKTic][eclektic] | 4.69 | 11.0 | 17.2 | 24.4 | | [IndicGenBench][indicgenbench] | 41.4 | 57.2 | 61.7 | 63.4 | [mgsm]: [flores]: [xquad]: [global-mmlu-lite]: [wmt24pp]: [eclektic]: [indicgenbench]: #### Multimodal | Benchmark | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |:-------------:|:--------------:|:--------------:| | [COCOcap][coco-cap] | 102 | 111 | 116 | | [DocVQA][docvqa] (val) | 72.8 | 82.3 | 85.6 | | [InfoVQA][info-vqa] (val) | 44.1 | 54.8 | 59.4 | | [MMMU][mmmu] (pt) | 39.2 | 50.3 | 56.1 | | [TextVQA][textvqa] (val) | 58.9 | 66.5 | 68.6 | | [RealWorldQA][realworldqa] | 45.5 | 52.2 | 53.9 | | [ReMI][remi] | 27.3 | 38.5 | 44.8 | | [AI2D][ai2d] | 63.2 | 75.2 | 79.0 | | [ChartQA][chartqa] | 63.6 | 74.7 | 76.3 | | [VQAv2][vqav2] | 63.9 | 71.2 | 72.9 | | [BLINK][blinkvqa] | 38.0 | 35.9 | 39.6 | | [OKVQA][okvqa] | 51.0 | 58.7 | 60.2 | | [TallyQA][tallyqa] | 42.5 | 51.8 | 54.3 | | [SpatialSense VQA][ss-vqa] | 50.9 | 60.0 | 59.4 | | [CountBenchQA][countbenchqa] | 26.1 | 17.8 | 68.0 | [coco-cap]: [docvqa]: [info-vqa]: [mmmu]: [textvqa]: [realworldqa]: [remi]: [ai2d]: [chartqa]: [vqav2]: [blinkvqa]: [okvqa]: [tallyqa]: [ss-vqa]: [countbenchqa]: ## Ethics and Safety Ethics and 
safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: - **Child Safety**: Evaluation of text-to-text and image to text prompts covering child safety policies, including child sexual abuse and exploitation. - **Content Safety:** Evaluation of text-to-text and image to text prompts covering safety policies including, harassment, violence and gore, and hate speech. - **Representational Harms**: Evaluation of text-to-text and image to text prompts covering safety policies including bias, stereotyping, and harmful associations or inaccuracies. In addition to development level evaluations, we conduct \"assurance evaluations\" which are our 'arms-length' internal evaluations for responsibility governance decision making. They are conducted separately from the model development team, to inform decision making about release. High level findings are fed back to the model team, but prompt sets are held-out to prevent overfitting and preserve the results' ability to inform decision making. Assurance evaluation results are reported to our Responsibility & Safety Council as part of release review. ### Evaluation Results For all areas of safety testing, we saw major improvements in the categories of child safety, content safety, and representational harms relative to previous Gemma models. All testing was conducted without safety filters to evaluate the model capabilities and behaviors. For both text-to-text and image-to-text, and across all model sizes, the model produced minimal policy violations, and showed significant improvements over previous Gemma models' performance with respect to ungrounded inferences. A limitation of our evaluations was they included only English language prompts. ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open vision-language models (VLMs) models have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. - Content Creation and Communication - Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. - Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. - Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. - Image Data Extraction: These models can be used to extract, interpret, and summarize visual data for text communications. - Research and Education - Natural Language Processing (NLP) and VLM Research: These models can serve as a foundation for researchers to experiment with VLM and NLP techniques, develop algorithms, and contribute to the advancement of the field. - Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. 
- Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations - Training Data - The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. - The scope of the training dataset determines the subject areas the model can handle effectively. - Context and Task Complexity - Models are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. - A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). - Language Ambiguity and Nuance - Natural language is inherently complex. Models might struggle to grasp subtle nuances, sarcasm, or figurative language. - Factual Accuracy - Models generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. - Common Sense - Models rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of vision-language models (VLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: - Bias and Fairness - VLMs trained on large-scale, real-world text and image data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, input data pre-processing described and posterior evaluations reported in this card. - Misinformation and Misuse - VLMs can be misused to generate text that is false, misleading, or harmful. - Guidelines are provided for responsible use with the model, see the [Responsible Generative AI Toolkit][rai-toolkit]. - Transparency and Accountability: - This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. - A responsibly developed open model offers the opportunity to share innovation by making VLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: - **Perpetuation of biases**: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. - **Generation of harmful content**: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. - **Misuse for malicious purposes**: Technical limitations and developer and end-user education can help mitigate against malicious applications of VLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. - **Privacy violations**: Models were trained on data filtered for removal of certain personal information and other sensitive data. Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. 
### Benefits At the time of release, this family of models provides high-performance open vision-language model implementations designed from the ground up for responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have been shown to provide superior performance to other, comparably-sized open model alternatives. [g3-tech-report]: [rai-toolkit]: [kaggle-gemma]: [vertex-mg-gemma3]: [terms]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [gemini-2-paper]:",
+ "model_explanation_gemini": "\"google_gemma-3-27b-it\" is a lightweight, multimodal AI model that processes text and images (896x896 resolution) to generate text outputs, supporting 140+ languages and a 128K token context window for tasks like question answering, summarization, and reasoning. \n\n**Features**: \n- Multimodal (text + image inputs, text output) \n- 128K token context window (27B size) \n- Supports 140+ languages \n-"
+}
\ No newline at end of file
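The gemma-3-27b-it description above refers to pipeline-based usage, but the actual code snippets were stripped from the card. Below is a minimal, hedged sketch of that flow, assuming transformers >= 4.50.0 and a CUDA device; the image URL, prompt, and generation settings are placeholders, not taken from the original card.

```python
# Hedged sketch: multimodal inference with google/gemma-3-27b-it via the
# image-text-to-text pipeline (assumes transformers >= 4.50.0 and a GPU;
# the image URL and question are placeholders).
import torch
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",
    model="google/gemma-3-27b-it",
    device="cuda",
    torch_dtype=torch.bfloat16,
)

# Instruction-tuned Gemma 3 checkpoints expect chat-formatted messages.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/photo.jpg"},  # placeholder image
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }
]

output = pipe(text=messages, max_new_tokens=100)
# The pipeline echoes the conversation; the last message holds the model's reply.
print(output[0]["generated_text"][-1]["content"])
```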
diff --git a/model_data_json/google_gemma-3-4b-it.json b/model_data_json/google_gemma-3-4b-it.json
new file mode 100644
index 0000000000000000000000000000000000000000..80d5844dbc92ef0729f7bc881bc0fc0e174b2ecf
--- /dev/null
+++ b/model_data_json/google_gemma-3-4b-it.json
@@ -0,0 +1,47 @@
+{
+ "model_id": "google/gemma-3-4b-it",
+ "downloads": 576385,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "gemma3",
+ "image-text-to-text",
+ "conversational",
+ "arxiv:1905.07830",
+ "arxiv:1905.10044",
+ "arxiv:1911.11641",
+ "arxiv:1904.09728",
+ "arxiv:1705.03551",
+ "arxiv:1911.01547",
+ "arxiv:1907.10641",
+ "arxiv:1903.00161",
+ "arxiv:2009.03300",
+ "arxiv:2304.06364",
+ "arxiv:2103.03874",
+ "arxiv:2110.14168",
+ "arxiv:2311.12022",
+ "arxiv:2108.07732",
+ "arxiv:2107.03374",
+ "arxiv:2210.03057",
+ "arxiv:2106.03193",
+ "arxiv:1910.11856",
+ "arxiv:2502.12404",
+ "arxiv:2502.21228",
+ "arxiv:2404.16816",
+ "arxiv:2104.12756",
+ "arxiv:2311.16502",
+ "arxiv:2203.10244",
+ "arxiv:2404.12390",
+ "arxiv:1810.12440",
+ "arxiv:1908.02660",
+ "arxiv:2312.11805",
+ "base_model:google/gemma-3-4b-pt",
+ "base_model:finetune:google/gemma-3-4b-pt",
+ "license:gemma",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: gemma library_name: transformers pipeline_tag: image-text-to-text extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license base_model: google/gemma-3-4b-pt --- # Gemma 3 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Gemma 3 Technical Report][g3-tech-report] * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma3] **Terms of Use**: [Terms][terms] **Authors**: Google DeepMind ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants. Gemma 3 has a large, 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Inputs and outputs - **Input:** - Text string, such as a question, a prompt, or a document to be summarized - Images, normalized to 896 x 896 resolution and encoded to 256 tokens each - Total input context of 128K tokens for the 4B, 12B, and 27B sizes, and 32K tokens for the 1B size - **Output:** - Generated text in response to the input, such as an answer to a question, analysis of image content, or a summary of a document - Total output context of 8192 tokens ### Usage Below, there are some code snippets on how to get quickly started with running the model. First, install the Transformers library. Gemma 3 is supported starting from transformers 4.50.0. Then, copy the snippet from the section that is relevant for your use case. #### Running with the API You can initialize the model and processor for inference with as follows. With instruction-tuned models, you need to use chat templates to process our inputs first. Then, you can pass it to the pipeline. #### Running the model on a single/multi GPU ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 14 trillion tokens, the 12B model was trained with 12 trillion tokens, 4B model was trained with 4 trillion tokens and 1B with 2 trillion tokens. Here are the key components: - Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. The training dataset includes content in over 140 languages. 
- Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code and understand code-related questions. - Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. - Images: A wide range of images enables the model to perform image analysis and visual data extraction tasks. The combination of these diverse data sources is crucial for training a powerful multimodal model that can handle a wide variety of different tasks and data formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: - CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. - Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. - Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using [Tensor Processing Unit (TPU)][tpu] hardware (TPUv4p, TPUv5p and TPUv5e). Training vision-language models (VLMS) requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: - Performance: TPUs are specifically designed to handle the massive computations involved in training VLMs. They can speed up training considerably compared to CPUs. - Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. - Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. - Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. - These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for foundation models, including large language models like these ones. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; *\"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\"* ## Evaluation Model evaluation metrics and results. 
### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: #### Reasoning and factuality | Benchmark | Metric | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:--------------:|:-------------:|:--------------:|:--------------:| | [HellaSwag][hellaswag] | 10-shot | 62.3 | 77.2 | 84.2 | 85.6 | | [BoolQ][boolq] | 0-shot | 63.2 | 72.3 | 78.8 | 82.4 | | [PIQA][piqa] | 0-shot | 73.8 | 79.6 | 81.8 | 83.3 | | [SocialIQA][socialiqa] | 0-shot | 48.9 | 51.9 | 53.4 | 54.9 | | [TriviaQA][triviaqa] | 5-shot | 39.8 | 65.8 | 78.2 | 85.5 | | [Natural Questions][naturalq] | 5-shot | 9.48 | 20.0 | 31.4 | 36.1 | | [ARC-c][arc] | 25-shot | 38.4 | 56.2 | 68.9 | 70.6 | | [ARC-e][arc] | 0-shot | 73.0 | 82.4 | 88.3 | 89.0 | | [WinoGrande][winogrande] | 5-shot | 58.2 | 64.7 | 74.3 | 78.8 | | [BIG-Bench Hard][bbh] | few-shot | 28.4 | 50.9 | 72.6 | 77.7 | | [DROP][drop] | 1-shot | 42.4 | 60.1 | 72.2 | 77.2 | [hellaswag]: [boolq]: [piqa]: [socialiqa]: [triviaqa]: [naturalq]: [arc]: [winogrande]: [bbh]: [drop]: #### STEM and code | Benchmark | Metric | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:-------------:|:--------------:|:--------------:| | [MMLU][mmlu] | 5-shot | 59.6 | 74.5 | 78.6 | | [MMLU][mmlu] (Pro COT) | 5-shot | 29.2 | 45.3 | 52.2 | | [AGIEval][agieval] | 3-5-shot | 42.1 | 57.4 | 66.2 | | [MATH][math] | 4-shot | 24.2 | 43.3 | 50.0 | | [GSM8K][gsm8k] | 8-shot | 38.4 | 71.0 | 82.6 | | [GPQA][gpqa] | 5-shot | 15.0 | 25.4 | 24.3 | | [MBPP][mbpp] | 3-shot | 46.0 | 60.4 | 65.6 | | [HumanEval][humaneval] | 0-shot | 36.0 | 45.7 | 48.8 | [mmlu]: [agieval]: [math]: [gsm8k]: [gpqa]: [mbpp]: [humaneval]: #### Multilingual | Benchmark | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------------ |:-------------:|:-------------:|:--------------:|:--------------:| | [MGSM][mgsm] | 2.04 | 34.7 | 64.3 | 74.3 | | [Global-MMLU-Lite][global-mmlu-lite] | 24.9 | 57.0 | 69.4 | 75.7 | | [WMT24++][wmt24pp] (ChrF) | 36.7 | 48.4 | 53.9 | 55.7 | | [FloRes][flores] | 29.5 | 39.2 | 46.0 | 48.8 | | [XQuAD][xquad] (all) | 43.9 | 68.0 | 74.5 | 76.8 | | [ECLeKTic][eclektic] | 4.69 | 11.0 | 17.2 | 24.4 | | [IndicGenBench][indicgenbench] | 41.4 | 57.2 | 61.7 | 63.4 | [mgsm]: [flores]: [xquad]: [global-mmlu-lite]: [wmt24pp]: [eclektic]: [indicgenbench]: #### Multimodal | Benchmark | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |:-------------:|:--------------:|:--------------:| | [COCOcap][coco-cap] | 102 | 111 | 116 | | [DocVQA][docvqa] (val) | 72.8 | 82.3 | 85.6 | | [InfoVQA][info-vqa] (val) | 44.1 | 54.8 | 59.4 | | [MMMU][mmmu] (pt) | 39.2 | 50.3 | 56.1 | | [TextVQA][textvqa] (val) | 58.9 | 66.5 | 68.6 | | [RealWorldQA][realworldqa] | 45.5 | 52.2 | 53.9 | | [ReMI][remi] | 27.3 | 38.5 | 44.8 | | [AI2D][ai2d] | 63.2 | 75.2 | 79.0 | | [ChartQA][chartqa] | 63.6 | 74.7 | 76.3 | | [VQAv2][vqav2] | 63.9 | 71.2 | 72.9 | | [BLINK][blinkvqa] | 38.0 | 35.9 | 39.6 | | [OKVQA][okvqa] | 51.0 | 58.7 | 60.2 | | [TallyQA][tallyqa] | 42.5 | 51.8 | 54.3 | | [SpatialSense VQA][ss-vqa] | 50.9 | 60.0 | 59.4 | | [CountBenchQA][countbenchqa] | 26.1 | 17.8 | 68.0 | [coco-cap]: [docvqa]: [info-vqa]: [mmmu]: [textvqa]: [realworldqa]: [remi]: [ai2d]: [chartqa]: [vqav2]: [blinkvqa]: [okvqa]: [tallyqa]: [ss-vqa]: [countbenchqa]: ## Ethics and Safety Ethics and 
safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: - **Child Safety**: Evaluation of text-to-text and image to text prompts covering child safety policies, including child sexual abuse and exploitation. - **Content Safety:** Evaluation of text-to-text and image to text prompts covering safety policies including, harassment, violence and gore, and hate speech. - **Representational Harms**: Evaluation of text-to-text and image to text prompts covering safety policies including bias, stereotyping, and harmful associations or inaccuracies. In addition to development level evaluations, we conduct \"assurance evaluations\" which are our 'arms-length' internal evaluations for responsibility governance decision making. They are conducted separately from the model development team, to inform decision making about release. High level findings are fed back to the model team, but prompt sets are held-out to prevent overfitting and preserve the results' ability to inform decision making. Assurance evaluation results are reported to our Responsibility & Safety Council as part of release review. ### Evaluation Results For all areas of safety testing, we saw major improvements in the categories of child safety, content safety, and representational harms relative to previous Gemma models. All testing was conducted without safety filters to evaluate the model capabilities and behaviors. For both text-to-text and image-to-text, and across all model sizes, the model produced minimal policy violations, and showed significant improvements over previous Gemma models' performance with respect to ungrounded inferences. A limitation of our evaluations was they included only English language prompts. ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open vision-language models (VLMs) models have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. - Content Creation and Communication - Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. - Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. - Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. - Image Data Extraction: These models can be used to extract, interpret, and summarize visual data for text communications. - Research and Education - Natural Language Processing (NLP) and VLM Research: These models can serve as a foundation for researchers to experiment with VLM and NLP techniques, develop algorithms, and contribute to the advancement of the field. - Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. 
- Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations - Training Data - The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. - The scope of the training dataset determines the subject areas the model can handle effectively. - Context and Task Complexity - Models are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. - A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). - Language Ambiguity and Nuance - Natural language is inherently complex. Models might struggle to grasp subtle nuances, sarcasm, or figurative language. - Factual Accuracy - Models generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. - Common Sense - Models rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of vision-language models (VLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: - Bias and Fairness - VLMs trained on large-scale, real-world text and image data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, input data pre-processing described and posterior evaluations reported in this card. - Misinformation and Misuse - VLMs can be misused to generate text that is false, misleading, or harmful. - Guidelines are provided for responsible use with the model, see the [Responsible Generative AI Toolkit][rai-toolkit]. - Transparency and Accountability: - This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. - A responsibly developed open model offers the opportunity to share innovation by making VLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: - **Perpetuation of biases**: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. - **Generation of harmful content**: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. - **Misuse for malicious purposes**: Technical limitations and developer and end-user education can help mitigate against malicious applications of VLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. - **Privacy violations**: Models were trained on data filtered for removal of certain personal information and other sensitive data. Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. 
### Benefits At the time of release, this family of models provides high-performance open vision-language model implementations designed from the ground up for responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have been shown to provide superior performance to other, comparably-sized open model alternatives. [g3-tech-report]: [rai-toolkit]: [kaggle-gemma]: [vertex-mg-gemma3]: [terms]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [gemini-2-paper]:",
+ "model_explanation_gemini": "Generates text outputs from text or image inputs, supporting multilingual tasks like question answering, summarization, and reasoning with a 128K context window. \n\n**Features:** \n- Multimodal (text/image input, text output) \n- 128K token context window (4B/12B/27B sizes) \n- Supports 140+ languages \n- Lightweight for local/cloud deployment \n- Instruction-tuned variants available \n\n**Comparison:** \nMore sizes and longer context than"
+}
\ No newline at end of file
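The gemma-3-4b-it card likewise mentions a "Running the model on a single/multi GPU" snippet that was stripped. The sketch below shows one plausible version using AutoProcessor and Gemma3ForConditionalGeneration (assumes transformers >= 4.50.0); the image URL, device_map, and dtype choices are illustrative assumptions.

```python
# Hedged sketch: single-/multi-GPU inference for google/gemma-3-4b-it
# (assumes transformers >= 4.50.0; image URL and generation settings are illustrative).
import torch
from transformers import AutoProcessor, Gemma3ForConditionalGeneration

model_id = "google/gemma-3-4b-it"
model = Gemma3ForConditionalGeneration.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.bfloat16
).eval()
processor = AutoProcessor.from_pretrained(model_id)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/photo.jpg"},  # placeholder image
            {"type": "text", "text": "What is shown in this image?"},
        ],
    }
]

# The processor's chat template inserts the image tokens and turn delimiters.
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

with torch.inference_mode():
    generated = model.generate(**inputs, max_new_tokens=100, do_sample=False)

# Drop the prompt tokens before decoding the reply.
reply = processor.decode(generated[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(reply)
```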
diff --git a/model_data_json/google_gemma-7b-it.json b/model_data_json/google_gemma-7b-it.json
new file mode 100644
index 0000000000000000000000000000000000000000..d3be1b145d549b5502ea8fe0c2aef609ce2c98d0
--- /dev/null
+++ b/model_data_json/google_gemma-7b-it.json
@@ -0,0 +1,44 @@
+{
+ "model_id": "google/gemma-7b-it",
+ "downloads": 80370,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "gguf",
+ "gemma",
+ "text-generation",
+ "conversational",
+ "arxiv:2312.11805",
+ "arxiv:2009.03300",
+ "arxiv:1905.07830",
+ "arxiv:1911.11641",
+ "arxiv:1904.09728",
+ "arxiv:1905.10044",
+ "arxiv:1907.10641",
+ "arxiv:1811.00937",
+ "arxiv:1809.02789",
+ "arxiv:1911.01547",
+ "arxiv:1705.03551",
+ "arxiv:2107.03374",
+ "arxiv:2108.07732",
+ "arxiv:2110.14168",
+ "arxiv:2304.06364",
+ "arxiv:2206.04615",
+ "arxiv:1804.06876",
+ "arxiv:2110.08193",
+ "arxiv:2009.11462",
+ "arxiv:2101.11718",
+ "arxiv:1804.09301",
+ "arxiv:2109.07958",
+ "arxiv:2203.09509",
+ "base_model:google/gemma-7b",
+ "base_model:finetune:google/gemma-7b",
+ "license:gemma",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- library_name: transformers license: gemma tags: [] widget: - messages: - role: user content: How does the brain work? inference: parameters: max_new_tokens: 200 extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged-in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license base_model: google/gemma-7b base_model_relation: finetune --- # Gemma Model Card **Model Page**: Gemma This model card corresponds to the 7B instruct version of the Gemma model. You can also visit the model card of the 2B base model, 7B base model, and 2B instruct model. **Resources and Technical Documentation**: * Responsible Generative AI Toolkit * Gemma on Kaggle * Gemma on Vertex Model Garden **Terms of Use**: Terms **Authors**: Google ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Usage Below we share some code snippets on how to get quickly started with running the model. First make sure to , then copy the snippet from the section that is relevant for your usecase. #### Fine-tuning the model You can find fine-tuning scripts and notebook under the directory of []( repository. To adapt it to this model, simply change the model-id to . In that repository, we provide: * A script to perform Supervised Fine-Tuning (SFT) on UltraChat dataset using QLoRA * A script to perform SFT using FSDP on TPU devices * A notebook that you can run on a free-tier Google Colab instance to perform SFT on English quotes dataset #### Running the model on a CPU As explained below, we recommend as the default dtype. You can use a different precision if necessary. #### Running the model on a single / multi GPU #### Running the model on a GPU using different precisions The native weights of this model were exported in precision. You can use , which may be faster on certain hardware, indicating the when loading the model. For convenience, the revision of the repo contains a copy of the weights already converted to that precision. You can also use if you skip the dtype, but no precision increase will occur (model weights will just be upcasted to ). See examples below. * _Using _ * _Using _ * _Upcasting to _ #### Quantized Versions through * _Using 8-bit precision (int8)_ * _Using 4-bit precision_ #### Other optimizations * _Flash Attention 2_ First make sure to install in your environment ### Chat Template The instruction-tuned models use a chat template that must be adhered to for conversational use. The easiest way to apply it is using the tokenizer's built-in chat template, as shown in the following snippet. Let's load the model and apply the chat template to a conversation. 
In this example, we'll start with a single user interaction: At this point, the prompt contains the following text: As you can see, each turn is preceded by a delimiter and then the role of the entity (either , for content supplied by the user, or for LLM responses). Turns finish with the token. You can follow this format to build the prompt manually, if you need to do it without the tokenizer's chat template. After the prompt is ready, generation can be performed like this: ### Inputs and outputs * **Input:** Text string, such as a question, a prompt, or a document to be summarized. * **Output:** Generated English-language text in response to the input, such as an answer to a question, or a summary of a document. ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources, totaling 6 trillion tokens. Here are the key components: * Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. Primarily English-language content. * Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code or understand code-related questions. * Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. The combination of these diverse data sources is crucial for training a powerful language model that can handle a wide variety of different tasks and text formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: * CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content * Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. * Additional methods: Filtering based on content quality and safely in line with our policies. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using the latest generation of Tensor Processing Unit (TPU) hardware (TPUv5e). Training large language models requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: * Performance: TPUs are specifically designed to handle the massive computations involved in training LLMs. They can speed up training considerably compared to CPUs. * Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. * Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. * Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. * These advantages are aligned with Google's commitments to operate sustainably. 
### Software Training was done using JAX and ML Pathways. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for foundation models, including large language models like these ones. Together, JAX and ML Pathways are used as described in the paper about the Gemini family of models; \"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\" ## Evaluation Model evaluation metrics and results. ### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: | Benchmark | Metric | 2B Params | 7B Params | | ------------------------------ | ------------- | ----------- | --------- | | MMLU | 5-shot, top-1 | 42.3 | 64.3 | | HellaSwag | 0-shot |71.4 | 81.2 | | PIQA | 0-shot | 77.3 | 81.2 | | SocialIQA | 0-shot | 49.7 | 51.8 | | BooIQ | 0-shot | 69.4 | 83.2 | | WinoGrande | partial score | 65.4 | 72.3 | | CommonsenseQA | 7-shot | 65.3 | 71.3 | | OpenBookQA | | 47.8 | 52.8 | | ARC-e | | 73.2 | 81.5 | | ARC-c | | 42.1 | 53.2 | | TriviaQA | 5-shot | 53.2 | 63.4 | | Natural Questions | 5-shot | 12.5 | 23 | | HumanEval | pass@1 | 22.0 | 32.3 | | MBPP | 3-shot | 29.2 | 44.4 | | GSM8K | maj@1 | 17.7 | 46.4 | | MATH | 4-shot | 11.8 | 24.3 | | AGIEval | | 24.2 | 41.7 | | BIG-Bench | | 35.2 | 55.1 | | ------------------------------ | ------------- | ----------- | --------- | | **Average** | | **45.0** | **56.9** | ## Ethics and Safety Ethics and safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: * Text-to-Text Content Safety: Human evaluation on prompts covering safety policies including child sexual abuse and exploitation, harassment, violence and gore, and hate speech. * Text-to-Text Representational Harms: Benchmark against relevant academic datasets such as WinoBias and BBQ Dataset. * Memorization: Automated evaluation of memorization of training data, including the risk of personally identifiable information exposure. * Large-scale harm: Tests for \"dangerous capabilities,\" such as chemical, biological, radiological, and nuclear (CBRN) risks. ### Evaluation Results The results of ethics and safety evaluations are within acceptable thresholds for meeting internal policies for categories such as child safety, content safety, representational harms, memorization, large-scale harms. On top of robust internal evaluations, the results of well known safety benchmarks like BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA are shown here. 
| Benchmark | Metric | 2B Params | 7B Params | | ------------------------------ | ------------- | ----------- | --------- | | RealToxicity | average | 6.86 | 7.90 | | BOLD | | 45.57 | 49.08 | | CrowS-Pairs | top-1 | 45.82 | 51.33 | | BBQ Ambig | 1-shot, top-1 | 62.58 | 92.54 | | BBQ Disambig | top-1 | 54.62 | 71.99 | | Winogender | top-1 | 51.25 | 54.17 | | TruthfulQA | | 44.84 | 31.81 | | Winobias 1_2 | | 56.12 | 59.09 | | Winobias 2_2 | | 91.10 | 92.23 | | Toxigen | | 29.77 | 39.59 | | ------------------------------ | ------------- | ----------- | --------- | ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open Large Language Models (LLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. * Content Creation and Communication * Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. * Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. * Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. * Research and Education * Natural Language Processing (NLP) Research: These models can serve as a foundation for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field. * Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. * Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations * Training Data * The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. * The scope of the training dataset determines the subject areas the model can handle effectively. * Context and Task Complexity * LLMs are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. * A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). * Language Ambiguity and Nuance * Natural language is inherently complex. LLMs might struggle to grasp subtle nuances, sarcasm, or figurative language. * Factual Accuracy * LLMs generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. * Common Sense * LLMs rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of large language models (LLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: * Bias and Fairness * LLMs trained on large-scale, real-world text data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, input data pre-processing described and posterior evaluations reported in this card. 
* Misinformation and Misuse * LLMs can be misused to generate text that is false, misleading, or harmful. * Guidelines are provided for responsible use with the model, see the Responsible Generative AI Toolkit. * Transparency and Accountability: * This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. * A responsibly developed open model offers the opportunity to share innovation by making LLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: * Perpetuation of biases: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. * Generation of harmful content: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. * Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate against malicious applications of LLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the Gemma Prohibited Use Policy. * Privacy violations: Models were trained on data filtered for removal of PII (Personally Identifiable Information). Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. ### Benefits At the time of release, this family of models provides high-performance open large language model implementations designed from the ground up for Responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have shown to provide superior performance to other, comparably-sized open model alternatives.",
+ "model_explanation_gemini": "\"google_gemma-7b-it\" is a lightweight, instruction-tuned, decoder-only LLM for English text generation tasks like question answering, summarization, and reasoning, designed for efficient deployment on limited-resource environments. \n\n**Features**: \n- Text-to-text generation \n- Instruction-tuned variant (optimized for conversational use) \n- Supports fine-tuning (QLoRA, FSDP on TPU) \n- Compatible with CPU/GPU (multiple precisions: b"
+}
\ No newline at end of file
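The gemma-7b-it description walks through the chat template but its snippets were removed. Below is a minimal, hedged sketch of that flow with AutoTokenizer and AutoModelForCausalLM; the prompt, dtype, and generation length are illustrative choices, not taken from the original card.

```python
# Hedged sketch: applying the Gemma chat template for google/gemma-7b-it
# (prompt, dtype, and generation length are illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-7b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.bfloat16
)

# A single user turn; the tokenizer's built-in chat template adds the
# <start_of_turn>/<end_of_turn> delimiters described in the card.
chat = [{"role": "user", "content": "Write a hello world program in Python."}]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```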
diff --git a/model_data_json/google_long-t5-tglobal-base.json b/model_data_json/google_long-t5-tglobal-base.json
new file mode 100644
index 0000000000000000000000000000000000000000..effc18a4482a634841c847832d388f583c31a5c5
--- /dev/null
+++ b/model_data_json/google_long-t5-tglobal-base.json
@@ -0,0 +1,21 @@
+{
+ "model_id": "google/long-t5-tglobal-base",
+ "downloads": 73842,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "jax",
+ "longt5",
+ "text2text-generation",
+ "en",
+ "arxiv:2112.07916",
+ "arxiv:1912.08777",
+ "arxiv:1910.10683",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 language: en --- # LongT5 (transient-global attention, base-sized model) LongT5 model pre-trained on English language. The model was introduced in the paper LongT5: Efficient Text-To-Text Transformer for Long Sequences by Guo et al. and first released in the LongT5 repository. All the model architecture and configuration can be found in Flaxformer repository which uses another Google research project repository T5x. Disclaimer: The team releasing LongT5 did not write a model card for this model so this model card has been written by the Hugging Face team. ## Model description LongT5 model is an encoder-decoder transformer pre-trained in a text-to-text denoising generative setting (Pegasus-like generation pre-training). LongT5 model is an extension of T5 model, and it enables using one of the two different efficient attention mechanisms - (1) Local attention, or (2) Transient-Global attention. The usage of attention sparsity patterns allows the model to efficiently handle input sequence. LongT5 is particularly effective when fine-tuned for text generation (summarization, question answering) which requires handling long input sequences (up to 16,384 tokens). ## Intended uses & limitations The model is mostly meant to be fine-tuned on a supervised dataset. See the model hub to look for fine-tuned versions on a task that interests you. ### How to use ### BibTeX entry and citation info",
+ "model_explanation_gemini": "\"LongT5-tglobal-base is a text-to-text transformer model optimized for handling long sequences up to 16,384 tokens, using transient-global attention for efficient text generation tasks like summarization and question answering.\"\n\nFeatures: \n- Base-sized encoder-decoder transformer \n- Pre-trained with Pegasus-like denoising generation \n- Uses transient-global attention mechanism \n- Handles long input sequences (16K tokens) \n- Optimized for text generation tasks \n\nComparison: \nExtends"
+}
\ No newline at end of file
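The LongT5 card's "How to use" section is empty in the description above. The sketch below shows one way to load the checkpoint for long-input seq2seq generation; as the card notes, the raw pre-trained model is intended to be fine-tuned (e.g., for summarization or question answering) before it produces useful outputs, and the input text here is a placeholder.

```python
# Hedged sketch: loading google/long-t5-tglobal-base for long-input generation
# (the document text is a placeholder; the raw checkpoint should normally be
# fine-tuned on a supervised dataset before use).
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

model_id = "google/long-t5-tglobal-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = LongT5ForConditionalGeneration.from_pretrained(model_id)

long_document = "..."  # replace with a document of up to ~16,384 tokens
inputs = tokenizer(long_document, max_length=16384, truncation=True, return_tensors="pt")

summary_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```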
diff --git a/model_data_json/google_mobilenet_v2_1.0_224.json b/model_data_json/google_mobilenet_v2_1.0_224.json
new file mode 100644
index 0000000000000000000000000000000000000000..74e5ade6d5e25bd033e51e93191d35852292f21d
--- /dev/null
+++ b/model_data_json/google_mobilenet_v2_1.0_224.json
@@ -0,0 +1,20 @@
+{
+ "model_id": "google/mobilenet_v2_1.0_224",
+ "downloads": 72081,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "safetensors",
+ "mobilenet_v2",
+ "image-classification",
+ "vision",
+ "dataset:imagenet-1k",
+ "arxiv:1801.04381",
+ "license:other",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: other tags: - vision - image-classification datasets: - imagenet-1k widget: - src: example_title: Tiger - src: example_title: Teapot - src: example_title: Palace --- # MobileNet V2 MobileNet V2 model pre-trained on ImageNet-1k at resolution 224x224. It was introduced in MobileNetV2: Inverted Residuals and Linear Bottlenecks by Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen. It was first released in this repository. Disclaimer: The team releasing MobileNet V2 did not write a model card for this model so this model card has been written by the Hugging Face team. ## Model description From the original README: > MobileNets are small, low-latency, low-power models parameterized to meet the resource constraints of a variety of use cases. They can be built upon for classification, detection, embeddings and segmentation similar to how other popular large scale models, such as Inception, are used. MobileNets can be run efficiently on mobile devices [...] MobileNets trade off between latency, size and accuracy while comparing favorably with popular models from the literature. The checkpoints are named **mobilenet\\_v2\\_*depth*\\_*size***, for example **mobilenet\\_v2\\_1.0\\_224**, where **1.0** is the depth multiplier and **224** is the resolution of the input images the model was trained on. ## Intended uses & limitations You can use the raw model for image classification. See the model hub to look for fine-tuned versions on a task that interests you. ### How to use Here is how to use this model to classify an image of the COCO 2017 dataset into one of the 1,000 ImageNet classes: Note: This model actually predicts 1001 classes, the 1000 classes from ImageNet plus an extra “background” class (index 0). Currently, both the feature extractor and model support PyTorch. ### BibTeX entry and citation info",
+ "model_explanation_gemini": "\"MobileNet V2 is a lightweight vision model pre-trained on ImageNet-1k for efficient image classification at 224x224 resolution, optimized for mobile devices with trade-offs between latency, size, and accuracy.\"\n\nModel Features: \n- Architecture: MobileNet V2 (Inverted Residuals, Linear Bottlenecks) \n- Input Resolution: 224x224 \n- Pretraining Dataset: ImageNet-1k (1000 classes + 1 background class) \n- Use Case"
+}
\ No newline at end of file
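The MobileNet V2 card above refers to a classification snippet that is missing from this dump. A minimal, hedged sketch using the Transformers API (the COCO image URL is the usual documentation example and is an assumption here):

```python
# Hedged sketch: classify an image with MobileNet V2 via Transformers.
# The example image URL (a standard COCO 2017 sample) is an assumption for illustration.
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, MobileNetV2ForImageClassification

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = AutoImageProcessor.from_pretrained("google/mobilenet_v2_1.0_224")
model = MobileNetV2ForImageClassification.from_pretrained("google/mobilenet_v2_1.0_224")

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Index 0 is the extra "background" class mentioned in the card (1001 classes in total).
print(model.config.id2label[logits.argmax(-1).item()])
```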
diff --git a/model_data_json/google_paligemma-3b-mix-224.json b/model_data_json/google_paligemma-3b-mix-224.json
new file mode 100644
index 0000000000000000000000000000000000000000..215690166893e21fbef6d288d5f38c81636db1cf
--- /dev/null
+++ b/model_data_json/google_paligemma-3b-mix-224.json
@@ -0,0 +1,36 @@
+{
+ "model_id": "google/paligemma-3b-mix-224",
+ "downloads": 176245,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "paligemma",
+ "image-text-to-text",
+ "arxiv:2310.09199",
+ "arxiv:2303.15343",
+ "arxiv:2403.08295",
+ "arxiv:1706.03762",
+ "arxiv:2010.11929",
+ "arxiv:2209.06794",
+ "arxiv:2209.04372",
+ "arxiv:2103.01913",
+ "arxiv:2205.12522",
+ "arxiv:2110.11624",
+ "arxiv:2108.03353",
+ "arxiv:2010.04295",
+ "arxiv:2401.06209",
+ "arxiv:2305.10355",
+ "arxiv:2203.10244",
+ "arxiv:1810.12440",
+ "arxiv:1905.13648",
+ "arxiv:1608.00272",
+ "arxiv:1908.04913",
+ "arxiv:2407.07726",
+ "license:gemma",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- library_name: transformers license: gemma pipeline_tag: image-text-to-text extra_gated_heading: Access PaliGemma on Hugging Face extra_gated_prompt: To access PaliGemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged-in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license --- # PaliGemma model card **Model page:** PaliGemma Transformers PaliGemma 3B weights, fine-tuned with 224*224 input images and 256 token input/output text sequences on a mixture of downstream academic datasets. The models are available in float32, bfloat16 and float16 format for research purposes only. **Resources and technical documentation:** * Responsible Generative AI Toolkit * PaliGemma on Kaggle * PaliGemma on Vertex Model Garden **Terms of Use:** Terms **Authors:** Google ## Model information ### Model summary #### Description PaliGemma is a versatile and lightweight vision-language model (VLM) inspired by PaLI-3 and based on open components such as the SigLIP vision model and the Gemma language model. It takes both image and text as input and generates text as output, supporting multiple languages. It is designed for class-leading fine-tune performance on a wide range of vision-language tasks such as image and short video caption, visual question answering, text reading, object detection and object segmentation. #### Model architecture PaliGemma is the composition of a Transformer decoder and a Vision Transformer image encoder, with a total of 3 billion params. The text decoder is initialized from Gemma-2B. The image encoder is initialized from SigLIP-So400m/14. PaliGemma is trained following the PaLI-3 recipes. #### Inputs and outputs * **Input:** Image and text string, such as a prompt to caption the image, or a question. * **Output:** Generated text in response to the input, such as a caption of the image, an answer to a question, a list of object bounding box coordinates, or segmentation codewords. ### Model data #### Pre-train datasets PaliGemma is pre-trained on the following mixture of datasets: * **WebLI:** WebLI (Web Language Image) is a web-scale multilingual image-text dataset built from the public web. A wide range of WebLI splits are used to acquire versatile model capabilities, such as visual semantic understanding, object localization, visually-situated text understanding, multilinguality, etc. * **CC3M-35L:** Curated English image-alt_text pairs from webpages (Sharma et al., 2018). We used the Google Cloud Translation API to translate into 34 additional languages. * **VQ²A-CC3M-35L/VQG-CC3M-35L:** A subset of VQ2A-CC3M (Changpinyo et al., 2022a), translated into the same additional 34 languages as CC3M-35L, using the Google Cloud Translation API. * **OpenImages:** Detection and object-aware questions and answers (Piergiovanni et al. 2022) generated by handcrafted rules on the [OpenImages dataset]. * **WIT:** Images and texts collected from Wikipedia (Srinivasan et al., 2021). [OpenImages dataset]: #### Data responsibility filtering The following filters are applied to WebLI, with the goal of training PaliGemma on clean data: * **Pornographic image filtering:** This filter removes images deemed to be of pornographic nature. * **Text safety filtering:** We identify and filter out images that are paired with unsafe text. Unsafe text is any text deemed to contain or be about CSAI, pornography, vulgarities, or otherwise offensive. 
* **Text toxicity filtering:** We further use the Perspective API to identify and filter out images that are paired with text deemed insulting, obscene, hateful or otherwise toxic. * **Text personal information filtering:** We filtered certain personal information and other sensitive data using Cloud Data Loss Prevention (DLP) API to protect the privacy of individuals. Identifiers such as social security numbers and [other sensitive information types] were removed. * **Additional methods:** Filtering based on content quality and safety in line with our policies and practices. [other sensitive information types]: ## How to Use PaliGemma is a single-turn vision language model not meant for conversational use, and it works best when fine-tuned to a specific use case. You can configure which task the model will solve by conditioning it with task prefixes, such as “detect” or “segment”. The pretrained models were trained in this fashion to imbue them with a rich set of capabilities (question answering, captioning, segmentation, etc.). However, they are not designed to be used directly, but to be transferred (by fine-tuning) to specific tasks using a similar prompt structure. For interactive testing, you can use the \"mix\" family of models, which have been fine-tuned on a mixture of tasks. To see model google/paligemma-3b-mix-448 in action, check this Space that uses the Transformers codebase. Please refer to the usage and limitations section for intended use cases, or visit the blog post for additional details and examples. ## Use in Transformers The following snippets use model for reference purposes. The model in this repo you are now browsing may have been trained for other tasks; please make sure you use appropriate inputs for the task at hand. ### Running the default precision () on CPU Output: ### Running other precisions on CUDA For convenience, the repos contain revisions of the weights already converted to and , so you can use them to reduce the download size and avoid casting on your local computer. This is how you'd run on an NVIDIA CUDA card. ### Loading in 4-bit / 8-bit You need to install to automatically run inference using 8-bit or 4-bit precision: ## Implementation information ### Hardware PaliGemma was trained using the latest generation of Tensor Processing Unit (TPU) hardware (TPUv5e). ### Software Training was done using JAX, Flax, and TFDS. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. TFDS is used to access datasets and Flax is used for model architecture. The PaliGemma fine-tune code and inference code are released in the GitHub repository. ## Evaluation information ### Benchmark results In order to verify the transferability of PaliGemma to a wide variety of academic tasks, we fine-tune the pretrained models on each task. Additionally, we train the mix model with a mixture of the transfer tasks. We report results on different resolutions to provide an impression of which tasks benefit from increased resolution. Importantly, none of these tasks or datasets are part of the pretraining data mixture, and their images are explicitly removed from the web-scale pre-training data. #### Single task (fine-tune on single task)
## Ethics and safety ### Evaluation approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: * Human evaluation on prompts covering child safety, content safety and representational harms. See the Gemma model card for more details on evaluation approach, but with image captioning and visual question answering setups. * Image-to-Text benchmark evaluation: Benchmark against relevant academic datasets such as FairFace Dataset (Karkkainen et al., 2021). ### Evaluation results * The human evaluation results of ethics and safety evaluations are within acceptable thresholds for meeting internal policies for categories such as child safety, content safety and representational harms. * On top of robust internal evaluations, we also use the Perspective API (threshold of 0.8) to measure toxicity, profanity, and other potential issues in the generated captions for images sourced from the FairFace dataset. We report the maximum and median values observed across subgroups for each of the perceived gender, ethnicity, and age attributes.
## Quick Start The easiest way to start using it is through Jina AI's Embedding API. ## Intended Usage & Model Info This is an English, monolingual **embedding model** supporting **8192 sequence length**. It is based on a BERT architecture (JinaBERT) that supports the symmetric bidirectional variant of ALiBi to allow longer sequence lengths. The backbone is pretrained on the C4 dataset. The model is further trained on Jina AI's collection of more than 400 million sentence pairs and hard negatives. These pairs were obtained from various domains and were carefully selected through a thorough cleaning process. The embedding model was trained using a 512 sequence length, but it extrapolates to an 8k sequence length (or even longer) thanks to ALiBi. This makes our model useful for a range of use cases, especially when processing long documents is needed, including long document retrieval, semantic textual similarity, text reranking, recommendation, RAG and LLM-based generative search, etc. This model has 33 million parameters, which enables lightning-fast and memory-efficient inference, while still delivering impressive performance. Additionally, we provide the following embedding models: - []( 33 million parameters **(you are here)**. - []( 137 million parameters. - []( 161 million parameters Chinese-English Bilingual embeddings. - []( 161 million parameters German-English Bilingual embeddings. - [](): Spanish-English Bilingual embeddings (soon). ## Data & Parameters Jina Embeddings V2 technical report ## Usage **Please apply mean pooling when integrating the model.**
### Why mean pooling? Mean pooling takes all token embeddings from the model output and averages them at the sentence/paragraph level. It has proven to be the most effective way to produce high-quality sentence embeddings. We offer a function to deal with this. However, if you would like to do it without using the default function:
You can use Jina Embedding models directly from the transformers package. If you only want to handle shorter sequences, such as 2k, pass the parameter to the function: The latest sentence-transformers also supports Jina embeddings: ## Alternatives to Using Transformers Package 1. _Managed SaaS_: Get started with a free key on Jina AI's Embedding API. 2. _Private and high-performance deployment_: Get started by picking from our suite of models and deploying them on AWS SageMaker. ## RAG Performance According to the latest blog post from LlamaIndex, > In summary, to achieve the peak performance in both hit rate and MRR, the combination of OpenAI or JinaAI-Base embeddings with the CohereRerank/bge-reranker-large reranker stands out. ## Plans 1. Bilingual embedding models supporting more European & Asian languages, including Spanish, French, Italian and Japanese. 2. Multimodal embedding models to enable multimodal RAG applications. 3. High-performance rerankers. ## Troubleshooting **Loading of Model Code failed** If you forgot to pass the flag when calling or initializing the model via the class, you will receive an error that the model weights could not be initialized. This is caused by transformers falling back to creating a default BERT model, instead of a jina-embedding model: ## Contact Join our Discord community and chat with other community members about ideas. ## Citation If you find Jina Embeddings useful in your research, please cite the following paper:",
+ "model_explanation_gemini": "Generates sentence embeddings for tasks like similarity, classification, retrieval, and clustering in English.\n\nFeatures: \n- Sentence embedding generation \n- Supports similarity, classification, retrieval, and clustering tasks \n- Optimized for English language \n- Evaluated on MTEB benchmarks \n\nComparison: \nPerforms competitively on MTEB benchmarks for tasks like classification (e.g., 82.9% accuracy on AmazonPolarity) and retrieval (e.g., 39.18 MAP@1000 on Ar"
+}
\ No newline at end of file
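The "Use in Transformers" snippets referenced in the PaliGemma card above were stripped from this dump. A minimal, hedged captioning sketch in default precision (the task prefix "caption en" and the example image URL are assumptions; the gated weights require accepting Google's license first):

```python
# Hedged sketch: single-turn captioning with PaliGemma via Transformers.
# Assumes access to the gated repo google/paligemma-3b-mix-224 after accepting the license.
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma-3b-mix-224"
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)  # default precision, CPU

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # placeholder example image
image = Image.open(requests.get(url, stream=True).raw)

prompt = "caption en"  # task prefix, as described in the card
inputs = processor(text=prompt, images=image, return_tensors="pt")
with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=30)

# Strip the prompt tokens before decoding so only the generated caption remains.
print(processor.decode(generated[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```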
diff --git a/model_data_json/ku-nlp_deberta-v2-tiny-japanese.json b/model_data_json/ku-nlp_deberta-v2-tiny-japanese.json
new file mode 100644
index 0000000000000000000000000000000000000000..32af3ef49c69525410db48f0521a899a48c60513
--- /dev/null
+++ b/model_data_json/ku-nlp_deberta-v2-tiny-japanese.json
@@ -0,0 +1,22 @@
+{
+ "model_id": "ku-nlp/deberta-v2-tiny-japanese",
+ "downloads": 80838,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "safetensors",
+ "deberta-v2",
+ "fill-mask",
+ "deberta",
+ "ja",
+ "dataset:wikipedia",
+ "dataset:cc100",
+ "dataset:oscar",
+ "license:cc-by-sa-4.0",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- language: ja license: cc-by-sa-4.0 library_name: transformers tags: - deberta - deberta-v2 - fill-mask datasets: - wikipedia - cc100 - oscar metrics: - accuracy mask_token: \"[MASK]\" widget: - text: \"京都 大学 で 自然 言語 処理 を [MASK] する 。\" --- # Model Card for Japanese DeBERTa V2 tiny ## Model description This is a Japanese DeBERTa V2 tiny model pre-trained on Japanese Wikipedia, the Japanese portion of CC-100, and the Japanese portion of OSCAR. ## How to use You can use this model for masked language modeling as follows: You can also fine-tune this model on downstream tasks. ## Tokenization The input text should be segmented into words by Juman++ in advance. Juman++ 2.0.0-rc3 was used for pre-training. Each word is tokenized into subwords by sentencepiece. ## Training data We used the following corpora for pre-training: - Japanese Wikipedia (as of 20221020, 3.2GB, 27M sentences, 1.3M documents) - Japanese portion of CC-100 (85GB, 619M sentences, 66M documents) - Japanese portion of OSCAR (54GB, 326M sentences, 25M documents) Note that we filtered out documents annotated with \"header\", \"footer\", or \"noisy\" tags in OSCAR. Also note that Japanese Wikipedia was duplicated 10 times to make the total size of the corpus comparable to that of CC-100 and OSCAR. As a result, the total size of the training data is 171GB. ## Training procedure We first segmented texts in the corpora into words using Juman++. Then, we built a sentencepiece model with 32000 tokens including words (JumanDIC) and subwords induced by the unigram language model of sentencepiece. We tokenized the segmented corpora into subwords using the sentencepiece model and trained the Japanese DeBERTa model using transformers library. The training took 33 hours using 8 NVIDIA A100-SXM4-40GB GPUs. The following hyperparameters were used during pre-training: - learning_rate: 1e-3 - per_device_train_batch_size: 128 - distributed_type: multi-GPU - num_devices: 8 - gradient_accumulation_steps: 6 - total_train_batch_size: 6,144 - max_seq_length: 512 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-06 - lr_scheduler_type: linear schedule with warmup - training_steps: 100,000 - warmup_steps: 10,000 The accuracy of the trained model on the masked language modeling task was 0.593. The evaluation set consists of 5,000 randomly sampled documents from each of the training corpora. ## Acknowledgments This work was supported by Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures (JHPCN) through General Collaboration Project no. jh221004, \"Developing a Platform for Constructing and Sharing of Large-Scale Japanese Language Models\". For training models, we used the mdx: a platform for the data-driven future.",
+ "model_explanation_gemini": "A Japanese DeBERTa V2 tiny model pre-trained for masked language modeling on Japanese Wikipedia, CC-100, and OSCAR datasets, with input text requiring Juman++ word segmentation.\n\nFeatures: \n- Language: Japanese \n- Architecture: DeBERTa V2 tiny \n- Task: Masked language modeling \n- Training data: Japanese Wikipedia, CC-100, OSCAR (171GB total) \n- Tokenization: Juman++ word segmentation + SentencePiece subwords \n- License"
+}
\ No newline at end of file
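The masked-language-modeling example referenced in the card above is missing from this dump. A minimal sketch using the fill-mask pipeline with the widget sentence from the card (the card requires pre-segmenting input with Juman++; the widget text is already segmented):

```python
# Hedged sketch: masked language modeling with ku-nlp/deberta-v2-tiny-japanese.
# Input must already be segmented into words with Juman++, as the card states;
# the example sentence below is the pre-segmented widget text from the card.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="ku-nlp/deberta-v2-tiny-japanese")
for candidate in fill_mask("京都 大学 で 自然 言語 処理 を [MASK] する 。"):
    print(candidate["token_str"], candidate["score"])
```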
diff --git a/model_data_json/laion_CLIP-ViT-L-14-laion2B-s32B-b82K.json b/model_data_json/laion_CLIP-ViT-L-14-laion2B-s32B-b82K.json
new file mode 100644
index 0000000000000000000000000000000000000000..a11aaba8a45d0e367153baa1739aa806af24b5af
--- /dev/null
+++ b/model_data_json/laion_CLIP-ViT-L-14-laion2B-s32B-b82K.json
@@ -0,0 +1,19 @@
+{
+ "model_id": "laion/CLIP-ViT-L-14-laion2B-s32B-b82K",
+ "downloads": 71845,
+ "tags": [
+ "open_clip",
+ "pytorch",
+ "tensorboard",
+ "safetensors",
+ "clip",
+ "zero-shot-image-classification",
+ "arxiv:2110.09456",
+ "arxiv:2111.09883",
+ "arxiv:1910.04867",
+ "license:mit",
+ "region:us"
+ ],
+ "description": "--- license: mit widget: - src: >- candidate_labels: playing music, playing sports example_title: Cat & Dog library_name: open_clip pipeline_tag: zero-shot-image-classification --- # Model Card for CLIP ViT-L/14 - LAION-2B # Table of Contents 1. Model Details 2. Uses 3. Training Details 4. Evaluation 5. Acknowledgements 6. Citation 7. How To Get Started With the Model # Model Details ## Model Description A CLIP ViT L/14 model trained with the LAION-2B English subset of LAION-5B ( using OpenCLIP ( Model training ('babysitting') done by Ross Wightman on the JUWELS Booster supercomputer. See acknowledgements below. # Uses As per the original OpenAI CLIP model card, this model is intended as a research output for research communities. We hope that this model will enable researchers to better understand and explore zero-shot, arbitrary image classification. We also hope it can be used for interdisciplinary studies of the potential impact of such model. The OpenAI CLIP paper includes a discussion of potential downstream impacts to provide an example for this sort of analysis. Additionally, the LAION-5B blog ( and upcoming paper include additional discussion as it relates specifically to the training dataset. ## Direct Use Zero-shot image classification, image and text retrieval, among others. ## Downstream Use Image classification and other image task fine-tuning, linear probe image classification, image generation guiding and conditioning, among others. ## Out-of-Scope Use As per the OpenAI models, **Any** deployed use case of the model - whether commercial or not - is currently out of scope. Non-deployed use cases such as image search in a constrained environment, are also not recommended unless there is thorough in-domain testing of the model with a specific, fixed class taxonomy. This is because our safety assessment demonstrated a high need for task specific testing especially given the variability of CLIP’s performance with different class taxonomies. This makes untested and unconstrained deployment of the model in any use case currently potentially harmful. Certain use cases which would fall under the domain of surveillance and facial recognition are always out-of-scope regardless of performance of the model. This is because the use of artificial intelligence for tasks such as these can be premature currently given the lack of testing norms and checks to ensure its fair use. Since the model has not been purposefully trained in or evaluated on any languages other than English, its use should be limited to English language use cases. Further the above notice, the LAION-5B dataset used in training of these models has additional considerations, see below. # Training Details ## Training Data This model was trained with the 2 Billion sample English subset of LAION-5B ( **IMPORTANT NOTE:** The motivation behind dataset creation is to democratize research and experimentation around large-scale multi-modal model training and handling of uncurated, large-scale datasets crawled from publically available internet. Our recommendation is therefore to use the dataset for research purposes. Be aware that this large-scale dataset is uncurated. Keep in mind that the uncurated nature of the dataset means that collected links may lead to strongly discomforting and disturbing content for a human viewer. Therefore, please use the demo links with caution and at your own risk. 
It is possible to extract a “safe” subset by filtering out samples based on the safety tags (using a custom-trained NSFW classifier that we built). While this strongly reduces the chance for encountering potentially harmful content when viewing, we cannot entirely exclude the possibility of harmful content still being present in safe mode, so the warning also holds there. We think that providing the dataset openly to broad research and other interested communities will allow for transparent investigation of benefits that come along with training large-scale models as well as pitfalls and dangers that may stay unreported or unnoticed when working with closed large datasets that remain restricted to a small community. Providing our dataset openly, we do not, however, recommend using it for creating ready-to-go industrial products, as the basic research about general properties and safety of such large-scale models, which we would like to encourage with this release, is still in progress. ## Training Procedure The model was trained on 384 A100 GPUs using 200M sample 'virtual' epochs where dataset shards were sampled with replacement. The model was trained with 160 virtual epochs for a total of 32B samples seen. The first 68 epochs were trained with float16 AMP, global batch size 79K (208 per GPU). Initially running to epoch 75, where the loss spiked and training failed with NaN. Romain Beaumont was training H/14 and g/14 models at the same time on the Stability cluster and hit similar instabilities. Collectively, we tried restarts with: * different dataset shuffle seed * different LR * gradient clipping * modifications to the architecture * Norm modifications (stable norm for final, post embed norm for text transformer) as per thanks to Phil Wang * Extra attention block norms ala Normformer ( * Scaled cosine attention ala Swin-V2 ( None of the above ended up working. Most blew up within the same epoch as the original, with the exception of architecture mods. * Normformer mods significantly altered the network such that resuming did not quickly converge to previous performance; this was abandoned but might be worth trying from the start. * Scaled cosine attn initially looked promising and lasted until epoch 90 before loss suddenly increased and appeared to remain 'stuck'. In the end, restarting at epoch 69 with precision solved all instabilities and training continued from there with global batch size 86k (224 per GPU). On A100 GPUs, had a minimal impact on the throughput once matmuls were enabled in PyTorch. Approximately 10% slower than . Romain similarly changed the precision but ended up using to resolve issues. ### Slurm Script # Evaluation Evaluation done with code in the LAION CLIP Benchmark suite. ## Testing Data, Factors & Metrics ### Testing Data The testing is performed with VTAB+ (a combination of VTAB ( w/ additional robustness datasets) for classification and COCO and Flickr for retrieval. **TODO** - more detail ## Results The model achieves a 75.3 zero-shot top-1 accuracy on ImageNet-1k. An initial round of benchmarks has been performed on a wider range of datasets, currently viewable at **TODO** - create table for just this model's metrics. # Acknowledgements Acknowledging the Gauss Centre for Supercomputing e.V. ( for funding this part of the work by providing computing time through the John von Neumann Institute for Computing (NIC) on the GCS Supercomputer JUWELS Booster at Jülich Supercomputing Centre (JSC). 
# Citation **BibTeX:** LAION-5B OpenAI CLIP paper OpenCLIP software # How to Get Started with the Model Use the code below to get started with the model. ** TODO ** - Hugging Face transformers, OpenCLIP, and timm getting started snippets",
+ "model_explanation_gemini": "Performs zero-shot image classification and text-image retrieval by matching images with text prompts without task-specific training.\n\n**Model Features:**\n- Architecture: CLIP ViT-L/14 (Vision Transformer, Large/14 patch size)\n- Training Data: LAION-2B (English subset of LAION-5B)\n- Capabilities: Zero-shot image classification, image/text retrieval, fine-tuning for downstream tasks\n- Training: Done via OpenCLIP on JUWELS Booster supercomputer"
+}
\ No newline at end of file
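The getting-started snippets are marked TODO in the card above. A minimal, hedged zero-shot classification sketch with the OpenCLIP library (the candidate prompts and the example image URL are assumptions):

```python
# Hedged sketch: zero-shot image classification with OpenCLIP.
# Loads the checkpoint from the Hugging Face Hub via the "hf-hub:" prefix.
import requests
import torch
import open_clip
from PIL import Image

repo = "hf-hub:laion/CLIP-ViT-L-14-laion2B-s32B-b82K"
model, _, preprocess = open_clip.create_model_and_transforms(repo)
tokenizer = open_clip.get_tokenizer(repo)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # placeholder example image
image = preprocess(Image.open(requests.get(url, stream=True).raw)).unsqueeze(0)
text = tokenizer(["a photo of a cat", "a photo of a dog"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)  # probability assigned to each candidate label for the image
```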
diff --git a/model_data_json/laion_CLIP-convnext_large_d_320.laion2B-s29B-b131K-ft-soup.json b/model_data_json/laion_CLIP-convnext_large_d_320.laion2B-s29B-b131K-ft-soup.json
new file mode 100644
index 0000000000000000000000000000000000000000..fed4e91b0d043a2714cb8a85fdbfdf4fc7b07fb7
--- /dev/null
+++ b/model_data_json/laion_CLIP-convnext_large_d_320.laion2B-s29B-b131K-ft-soup.json
@@ -0,0 +1,18 @@
+{
+ "model_id": "laion/CLIP-convnext_large_d_320.laion2B-s29B-b131K-ft-soup",
+ "downloads": 81153,
+ "tags": [
+ "open_clip",
+ "tensorboard",
+ "safetensors",
+ "zero-shot-image-classification",
+ "clip",
+ "arxiv:2201.03545",
+ "arxiv:2210.08402",
+ "arxiv:1910.04867",
+ "license:mit",
+ "region:us"
+ ],
+ "description": "--- tags: - zero-shot-image-classification - clip license: mit library_name: open_clip pipeline_tag: zero-shot-image-classification --- # Model card for CLIP-convnext_large_d_320.laion2B-s29B-b131K-ft-soup # Table of Contents 1. Model Details 2. Uses 3. Training Details 4. Evaluation 5. Acknowledgements 6. Citation # Model Details ## Model Description A series of CLIP ConvNeXt-Large (w/ extra text depth, vision MLP head) models trained on the LAION-2B (english) subset of LAION-5B using OpenCLIP. The models utilize: * the timm ConvNeXt-Large model () as the image tower * a MLP () head in vision tower instead of the single projection of other CLIP models * a text tower with same width but 4 layers more depth than ViT-L / RN50x16 models (depth 16, embed dim 768). This 320x320 resolution model is a soup (weight average) of 3 fine-tunes of CLIP-convnext_large_d.laion2B-s26B-b102K-augreg at a higher resolution. It is an average of 3 fine-tunes from the final checkpoint of the original 256x256 training run w/ an additional ~2-3B samples for each fine-tune and a lower learning rate. Each fine-tune was a different learning rate (1e-4, 6e-5, 5e-5), and diff # of samples (3.2B, 2B, 2.5B). At 320x320, the ConvNext-Large-D is significantly more efficient than the L/14 model at 336x336 that OpenAI fine-tuned. L/14-336 model is 2.5x more GMAC, 2.8x more activations, and 1.22x more parameters. | Model | Dataset | Resolution | AugReg | Top-1 ImageNet Zero-Shot (%) | | ----- | ------- | ---------- | ------------ | --------- | | convnext_large_d.laion2b_s26b_b102k-augreg | LAION-2B | 256x256 | RRC (0.33, 1.0), RE (0.35), SD (0.1), D(0.1) | 75.9 | | convnext_large_d_320.laion2b_s29b_b131k-ft | LAION-2B | 320x320 | RRC (0.5, 1.0), RE (0.4), SD (0.1), D(0.0) | 76.6 | | convnext_large_d_320.laion2b_s29b_b131k-ft-soup | LAION-2B | 320x320 | RRC (0.5, 1.0), RE (0.4), SD (0.1), D(0.0) | 76.9 | RRC = Random Resize Crop (crop pcts), RE = Random Erasing (prob), SD = Stochastic Depth (prob) -- image tower only, D = Dropout (prob) -- image tower head only LAION-A = LAION Aesthetic, an ~900M sample subset of LAION-2B with pHash dedupe and asthetic score filtering. Model training done by Ross Wightman on the stability.ai cluster. # Uses As per the original OpenAI CLIP model card, this model is intended as a research output for research communities. We hope that this model will enable researchers to better understand and explore zero-shot, arbitrary image classification. We also hope it can be used for interdisciplinary studies of the potential impact of such model. The OpenAI CLIP paper includes a discussion of potential downstream impacts to provide an example for this sort of analysis. Additionally, the LAION-5B blog ( and upcoming paper include additional discussion as it relates specifically to the training dataset. ## Direct Use Zero-shot image classification, image and text retrieval, among others. ## Downstream Use Image classification and other image task fine-tuning, linear probe image classification, image generation guiding and conditioning, among others. ## Out-of-Scope Use As per the OpenAI models, **Any** deployed use case of the model - whether commercial or not - is currently out of scope. Non-deployed use cases such as image search in a constrained environment, are also not recommended unless there is thorough in-domain testing of the model with a specific, fixed class taxonomy. 
This is because our safety assessment demonstrated a high need for task-specific testing, especially given the variability of CLIP’s performance with different class taxonomies. This makes untested and unconstrained deployment of the model in any use case currently potentially harmful. Certain use cases which would fall under the domain of surveillance and facial recognition are always out-of-scope regardless of performance of the model. This is because the use of artificial intelligence for tasks such as these can be premature currently given the lack of testing norms and checks to ensure its fair use. Since the model has not been purposefully trained in or evaluated on any languages other than English, its use should be limited to English language use cases. Further to the above notice, the LAION-5B dataset used in training these models has additional considerations; see below. # Training Details ## Training Data This model was trained with LAION-2B -- A 2 billion sample English subset of LAION-5B ( **IMPORTANT NOTE:** The motivation behind dataset creation is to democratize research and experimentation around large-scale multi-modal model training and handling of uncurated, large-scale datasets crawled from the publicly available internet. Our recommendation is therefore to use the dataset for research purposes. Be aware that this large-scale dataset is uncurated. Keep in mind that the uncurated nature of the dataset means that collected links may lead to strongly discomforting and disturbing content for a human viewer. Therefore, please use the demo links with caution and at your own risk. It is possible to extract a “safe” subset by filtering out samples based on the safety tags (using a custom-trained NSFW classifier that we built). While this strongly reduces the chance for encountering potentially harmful content when viewing, we cannot entirely exclude the possibility of harmful content still being present in safe mode, so the warning also holds there. We think that providing the dataset openly to broad research and other interested communities will allow for transparent investigation of benefits that come along with training large-scale models as well as pitfalls and dangers that may stay unreported or unnoticed when working with closed large datasets that remain restricted to a small community. Providing our dataset openly, we do not, however, recommend using it for creating ready-to-go industrial products, as the basic research about general properties and safety of such large-scale models, which we would like to encourage with this release, is still in progress. ## Training Procedure All 320x320 model fine-tunes were trained with a global batch size of 131072 for 10-16 checkpoint intervals of 203.7M samples for a total of ~2-3B samples seen over the fine-tune. For 320x320 models, a slurm script w/ srun below was used on 64 8-GPU (A100 40GB) nodes (Stability). # Evaluation Evaluation done with code in the LAION CLIP Benchmark suite. ## Testing Data, Factors & Metrics ### Testing Data The testing is performed with VTAB+ (a combination of VTAB ( w/ additional robustness datasets) for classification and COCO and Flickr for retrieval. ## Results The models achieve between 75.9 and 76.9 top-1 zero-shot accuracy on ImageNet-1k. Zero-shot curve of the original from-scratch 256x256 training: An initial round of benchmarks has been performed on a wider range of datasets, to be viewable at # Acknowledgements Acknowledging stability.ai for compute used to train this model. 
# Citation **BibTeX:** LAION-5B OpenCLIP software OpenAI CLIP paper",
+ "model_explanation_gemini": "A CLIP-based model for zero-shot image classification and retrieval, using a ConvNeXt-Large architecture with enhanced vision and text towers, trained on LAION-2B and fine-tuned at 320x320 resolution for improved efficiency and performance. \n\n**Features:** \n- **Architecture:** ConvNeXt-Large (vision tower) with MLP head, deeper text tower (16 layers, 768 embed dim). \n- **Training Data:** LAION-2B (English"
+}
\ No newline at end of file
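The getting-started snippet is likewise absent from this card. The checkpoint loads through OpenCLIP in the same way as the ViT-L/14 entry above; here is a short, hedged sketch of extracting image and text embeddings instead (the local image path and prompts are placeholders):

```python
# Hedged sketch: image/text embeddings with the ConvNeXt-Large CLIP soup checkpoint.
# The 320x320 preprocessing comes from the transform returned by open_clip for this repo.
import torch
import open_clip
from PIL import Image

repo = "hf-hub:laion/CLIP-convnext_large_d_320.laion2B-s29B-b131K-ft-soup"
model, _, preprocess = open_clip.create_model_and_transforms(repo)
tokenizer = open_clip.get_tokenizer(repo)

image = preprocess(Image.open("example.jpg")).unsqueeze(0)  # placeholder local image path
text = tokenizer(["a diagram", "a photo of a city at night"])

with torch.no_grad():
    image_embedding = model.encode_image(image)
    text_embeddings = model.encode_text(text)
    # Cosine similarity between the image and each text prompt.
    sims = torch.nn.functional.cosine_similarity(image_embedding, text_embeddings)

print(sims)
```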
diff --git a/model_data_json/laion_clap-htsat-fused.json b/model_data_json/laion_clap-htsat-fused.json
new file mode 100644
index 0000000000000000000000000000000000000000..8267f50cf1cfc4098c9660195c1c4852ed4c2f70
--- /dev/null
+++ b/model_data_json/laion_clap-htsat-fused.json
@@ -0,0 +1,17 @@
+{
+ "model_id": "laion/clap-htsat-fused",
+ "downloads": 81868,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "safetensors",
+ "clap",
+ "feature-extraction",
+ "arxiv:2211.06687",
+ "license:apache-2.0",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 --- # Model card for CLAP Model card for CLAP: Contrastive Language-Audio Pretraining !clap_image # Table of Contents 0. TL;DR 1. Model Details 2. Usage 3. Uses 4. Citation # TL;DR The abstract of the paper states that: > Contrastive learning has shown remarkable success in the field of multimodal representation learning. In this paper, we propose a pipeline of contrastive language-audio pretraining to develop an audio representation by combining audio data with natural language descriptions. To accomplish this target, we first release LAION-Audio-630K, a large collection of 633,526 audio-text pairs from different data sources. Second, we construct a contrastive language-audio pretraining model by considering different audio encoders and text encoders. We incorporate the feature fusion mechanism and keyword-to-caption augmentation into the model design to further enable the model to process audio inputs of variable lengths and enhance the performance. Third, we perform comprehensive experiments to evaluate our model across three tasks: text-to-audio retrieval, zero-shot audio classification, and supervised audio classification. The results demonstrate that our model achieves superior performance in text-to-audio retrieval task. In audio classification tasks, the model achieves state-of-the-art performance in the zero-shot setting and is able to obtain performance comparable to models' results in the non-zero-shot setting. LAION-Audio-630K and the proposed model are both available to the public. # Usage You can use this model for zero shot audio classification or extracting audio and/or textual features. # Uses ## Perform zero-shot audio classification ### Using ## Run the model: You can also get the audio and text embeddings using ### Run the model on CPU: ### Run the model on GPU: # Citation If you are using this model for your work, please consider citing the original paper:",
+ "model_explanation_gemini": "\"Contrastive Language-Audio Pretraining (CLAP) model for text-to-audio retrieval, zero-shot audio classification, and supervised audio classification, leveraging fused audio-text features and variable-length input processing.\"\n\n**Features**: \n1. Processes audio-text pairs for multimodal representation learning. \n2. Supports variable-length audio inputs via feature fusion. \n3. Enables zero-shot audio classification and text-to-audio retrieval. \n4. Outperforms others in retrieval and zero-shot tasks, matching"
+}
\ No newline at end of file
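The usage snippets referenced in the CLAP card above are missing from this dump. A minimal, hedged sketch with the Transformers zero-shot-audio-classification pipeline (the ESC-50 dataset repo and the candidate labels are assumptions for illustration):

```python
# Hedged sketch: zero-shot audio classification with CLAP via the Transformers pipeline.
# The dataset repo "ashraq/esc50" and the candidate labels are assumptions for illustration.
from datasets import load_dataset
from transformers import pipeline

dataset = load_dataset("ashraq/esc50", split="train[:1]")
audio = dataset[0]["audio"]["array"]  # raw waveform as a numpy array

classifier = pipeline("zero-shot-audio-classification", model="laion/clap-htsat-fused")
print(classifier(audio, candidate_labels=["sound of a dog", "sound of rain", "sound of a car"]))
```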
diff --git a/model_data_json/laion_clap-htsat-unfused.json b/model_data_json/laion_clap-htsat-unfused.json
new file mode 100644
index 0000000000000000000000000000000000000000..937aa81811f1028c3eccd2b830199967007e62cc
--- /dev/null
+++ b/model_data_json/laion_clap-htsat-unfused.json
@@ -0,0 +1,16 @@
+{
+ "model_id": "laion/clap-htsat-unfused",
+ "downloads": 72130,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "clap",
+ "feature-extraction",
+ "arxiv:2211.06687",
+ "license:apache-2.0",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 --- # Model card for CLAP Model card for CLAP: Contrastive Language-Audio Pretraining !clap_image # Table of Contents 0. TL;DR 1. Model Details 2. Usage 3. Uses 4. Citation # TL;DR The abstract of the paper states that: > Contrastive learning has shown remarkable success in the field of multimodal representation learning. In this paper, we propose a pipeline of contrastive language-audio pretraining to develop an audio representation by combining audio data with natural language descriptions. To accomplish this target, we first release LAION-Audio-630K, a large collection of 633,526 audio-text pairs from different data sources. Second, we construct a contrastive language-audio pretraining model by considering different audio encoders and text encoders. We incorporate the feature fusion mechanism and keyword-to-caption augmentation into the model design to further enable the model to process audio inputs of variable lengths and enhance the performance. Third, we perform comprehensive experiments to evaluate our model across three tasks: text-to-audio retrieval, zero-shot audio classification, and supervised audio classification. The results demonstrate that our model achieves superior performance in text-to-audio retrieval task. In audio classification tasks, the model achieves state-of-the-art performance in the zero-shot setting and is able to obtain performance comparable to models' results in the non-zero-shot setting. LAION-Audio-630K and the proposed model are both available to the public. # Usage You can use this model for zero shot audio classification or extracting audio and/or textual features. # Uses ## Perform zero-shot audio classification ### Using ## Run the model: You can also get the audio and text embeddings using ### Run the model on CPU: ### Run the model on GPU: # Citation If you are using this model for your work, please consider citing the original paper:",
+ "model_explanation_gemini": "\"Contrastive Language-Audio Pretraining (CLAP) model for text-to-audio retrieval, zero-shot audio classification, and supervised audio classification, trained on 633,526 audio-text pairs with feature fusion and keyword augmentation.\"\n\nFeatures: \n- Processes variable-length audio inputs \n- Generates audio/text embeddings \n- Supports zero-shot classification \n- Compatible with CPU/GPU inference \n\nComparison: \nOutperforms other models in text-to-audio retrieval and achieves state-of-the-art zero"
+}
\ No newline at end of file
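This card is identical to the fused variant above; rather than repeat the pipeline example, here is a hedged sketch of pulling text (and, commented out, audio) embeddings directly with ClapModel (the example prompts are assumptions):

```python
# Hedged sketch: extracting CLAP embeddings with the unfused checkpoint.
import torch
from transformers import ClapModel, ClapProcessor

model = ClapModel.from_pretrained("laion/clap-htsat-unfused")
processor = ClapProcessor.from_pretrained("laion/clap-htsat-unfused")

texts = ["the sound of a dog barking", "heavy rain on a roof"]  # example prompts (assumption)
text_inputs = processor(text=texts, return_tensors="pt", padding=True)
with torch.no_grad():
    text_embeds = model.get_text_features(**text_inputs)
print(text_embeds.shape)

# Audio embeddings work the same way, given a 48 kHz mono waveform `audio_array`:
# audio_inputs = processor(audios=audio_array, sampling_rate=48_000, return_tensors="pt")
# audio_embeds = model.get_audio_features(**audio_inputs)
```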
diff --git a/model_data_json/llm-book_bert-base-japanese-v3-ner-wikipedia-dataset.json b/model_data_json/llm-book_bert-base-japanese-v3-ner-wikipedia-dataset.json
new file mode 100644
index 0000000000000000000000000000000000000000..ce1e3c9752f83a63f0d541846b1851c35f89ec60
--- /dev/null
+++ b/model_data_json/llm-book_bert-base-japanese-v3-ner-wikipedia-dataset.json
@@ -0,0 +1,18 @@
+{
+ "model_id": "llm-book/bert-base-japanese-v3-ner-wikipedia-dataset",
+ "downloads": 78162,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "bert",
+ "token-classification",
+ "ja",
+ "dataset:llm-book/ner-wikipedia-dataset",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- language: - ja license: apache-2.0 library_name: transformers datasets: - llm-book/ner-wikipedia-dataset pipeline_tag: token-classification metrics: - seqeval - precision - recall - f1 --- # llm-book/bert-base-japanese-v3-ner-wikipedia-dataset 「大規模言語モデル入門」の第6章で紹介している固有表現認識のモデルです。 cl-tohoku/bert-base-japanese-v3をllm-book/ner-wikipedia-datasetでファインチューニングして構築されています。 ## 関連リンク * GitHubリポジトリ * Colabノートブック * データセット * 大規模言語モデル入門(Amazon.co.jp) * 大規模言語モデル入門(gihyo.jp) ## 使い方 ## ライセンス Apache License 2.0",
+ "model_explanation_gemini": "Performs Japanese named entity recognition (NER) by fine-tuning 'bert-base-japanese-v3' on a Wikipedia-derived dataset.\n\nModel Features: \n- Language: Japanese \n- Base Model: cl-tohoku/bert-base-japanese-v3 \n- Fine-tuning Dataset: llm-book/ner-wikipedia-dataset \n- Task: Token-classification (NER) \n- Metrics: seqeval, precision, recall, f1 \n- License: Apache 2.0 \n\nComparison:"
+}
\ No newline at end of file
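The usage section in the NER card above has no snippet in this dump. A minimal, hedged sketch with the token-classification pipeline (the example sentence is an assumption; the underlying Japanese tokenizer typically needs fugashi and unidic-lite installed):

```python
# Hedged sketch: Japanese named entity recognition with the fine-tuned BERT model.
# The base Japanese tokenizer generally requires `pip install fugashi unidic-lite`.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="llm-book/bert-base-japanese-v3-ner-wikipedia-dataset",
    aggregation_strategy="simple",
)

# Example sentence (assumption): "Shohei Ohtani is a professional baseball player from Oshu, Iwate."
for entity in ner("大谷翔平は岩手県奥州市出身のプロ野球選手です。"):
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))
```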
diff --git a/model_data_json/llmrails_ember-v1.json b/model_data_json/llmrails_ember-v1.json
new file mode 100644
index 0000000000000000000000000000000000000000..9f53718bc066d95e811b44cbb62b226441172e81
--- /dev/null
+++ b/model_data_json/llmrails_ember-v1.json
@@ -0,0 +1,26 @@
+{
+ "model_id": "llmrails/ember-v1",
+ "downloads": 71420,
+ "tags": [
+ "sentence-transformers",
+ "pytorch",
+ "safetensors",
+ "bert",
+ "feature-extraction",
+ "mteb",
+ "sentence-similarity",
+ "transformers",
+ "en",
+ "arxiv:2205.12035",
+ "arxiv:2209.11055",
+ "doi:10.57967/hf/2919",
+ "license:mit",
+ "model-index",
+ "autotrain_compatible",
+ "text-embeddings-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- tags: - mteb - sentence-transformers - feature-extraction - sentence-similarity - transformers language: en license: mit model-index: - name: ember_v1 results: - task: type: Classification dataset: type: mteb/amazon_counterfactual name: MTEB AmazonCounterfactualClassification (en) config: en split: test revision: e8379541af4e31359cca9fbcf4b00f2671dba205 metrics: - type: accuracy value: 76.05970149253731 - type: ap value: 38.76045348512767 - type: f1 value: 69.8824007294685 - task: type: Classification dataset: type: mteb/amazon_polarity name: MTEB AmazonPolarityClassification config: default split: test revision: e2d317d38cd51312af73b3d32a06d1a08b442046 metrics: - type: accuracy value: 91.977 - type: ap value: 88.63507587170176 - type: f1 value: 91.9524133311038 - task: type: Classification dataset: type: mteb/amazon_reviews_multi name: MTEB AmazonReviewsClassification (en) config: en split: test revision: 1399c76144fd37290681b995c656ef9b2e06e26d metrics: - type: accuracy value: 47.938 - type: f1 value: 47.58273047536129 - task: type: Retrieval dataset: type: arguana name: MTEB ArguAna config: default split: test revision: None metrics: - type: map_at_1 value: 41.252 - type: map_at_10 value: 56.567 - type: map_at_100 value: 57.07600000000001 - type: map_at_1000 value: 57.08 - type: map_at_3 value: 52.394 - type: map_at_5 value: 55.055 - type: mrr_at_1 value: 42.39 - type: mrr_at_10 value: 57.001999999999995 - type: mrr_at_100 value: 57.531 - type: mrr_at_1000 value: 57.535000000000004 - type: mrr_at_3 value: 52.845 - type: mrr_at_5 value: 55.47299999999999 - type: ndcg_at_1 value: 41.252 - type: ndcg_at_10 value: 64.563 - type: ndcg_at_100 value: 66.667 - type: ndcg_at_1000 value: 66.77 - type: ndcg_at_3 value: 56.120000000000005 - type: ndcg_at_5 value: 60.889 - type: precision_at_1 value: 41.252 - type: precision_at_10 value: 8.982999999999999 - type: precision_at_100 value: 0.989 - type: precision_at_1000 value: 0.1 - type: precision_at_3 value: 22.309 - type: precision_at_5 value: 15.690000000000001 - type: recall_at_1 value: 41.252 - type: recall_at_10 value: 89.82900000000001 - type: recall_at_100 value: 98.86200000000001 - type: recall_at_1000 value: 99.644 - type: recall_at_3 value: 66.927 - type: recall_at_5 value: 78.45 - task: type: Clustering dataset: type: mteb/arxiv-clustering-p2p name: MTEB ArxivClusteringP2P config: default split: test revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d metrics: - type: v_measure value: 48.5799968717232 - task: type: Clustering dataset: type: mteb/arxiv-clustering-s2s name: MTEB ArxivClusteringS2S config: default split: test revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53 metrics: - type: v_measure value: 43.142844164856136 - task: type: Reranking dataset: type: mteb/askubuntudupquestions-reranking name: MTEB AskUbuntuDupQuestions config: default split: test revision: 2000358ca161889fa9c082cb41daa8dcfb161a54 metrics: - type: map value: 64.45997990276463 - type: mrr value: 77.85560392208592 - task: type: STS dataset: type: mteb/biosses-sts name: MTEB BIOSSES config: default split: test revision: d3fb88f8f02e40887cd149695127462bbcf29b4a metrics: - type: cos_sim_pearson value: 86.38299310075898 - type: cos_sim_spearman value: 85.81038898286454 - type: euclidean_pearson value: 84.28002556389774 - type: euclidean_spearman value: 85.80315990248238 - type: manhattan_pearson value: 83.9755390675032 - type: manhattan_spearman value: 85.30435335611396 - task: type: Classification dataset: type: mteb/banking77 name: MTEB 
Banking77Classification config: default split: test revision: 0fd18e25b25c072e09e0d92ab615fda904d66300 metrics: - type: accuracy value: 87.89935064935065 - type: f1 value: 87.87886687103833 - task: type: Clustering dataset: type: mteb/biorxiv-clustering-p2p name: MTEB BiorxivClusteringP2P config: default split: test revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40 metrics: - type: v_measure value: 38.84335510371379 - task: type: Clustering dataset: type: mteb/biorxiv-clustering-s2s name: MTEB BiorxivClusteringS2S config: default split: test revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908 metrics: - type: v_measure value: 36.377963093857005 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackAndroidRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 32.557 - type: map_at_10 value: 44.501000000000005 - type: map_at_100 value: 46.11 - type: map_at_1000 value: 46.232 - type: map_at_3 value: 40.711000000000006 - type: map_at_5 value: 42.937 - type: mrr_at_1 value: 40.916000000000004 - type: mrr_at_10 value: 51.317 - type: mrr_at_100 value: 52.003 - type: mrr_at_1000 value: 52.044999999999995 - type: mrr_at_3 value: 48.569 - type: mrr_at_5 value: 50.322 - type: ndcg_at_1 value: 40.916000000000004 - type: ndcg_at_10 value: 51.353 - type: ndcg_at_100 value: 56.762 - type: ndcg_at_1000 value: 58.555 - type: ndcg_at_3 value: 46.064 - type: ndcg_at_5 value: 48.677 - type: precision_at_1 value: 40.916000000000004 - type: precision_at_10 value: 9.927999999999999 - type: precision_at_100 value: 1.592 - type: precision_at_1000 value: 0.20600000000000002 - type: precision_at_3 value: 22.078999999999997 - type: precision_at_5 value: 16.08 - type: recall_at_1 value: 32.557 - type: recall_at_10 value: 63.942 - type: recall_at_100 value: 86.436 - type: recall_at_1000 value: 97.547 - type: recall_at_3 value: 48.367 - type: recall_at_5 value: 55.818 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackEnglishRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 32.106 - type: map_at_10 value: 42.55 - type: map_at_100 value: 43.818 - type: map_at_1000 value: 43.952999999999996 - type: map_at_3 value: 39.421 - type: map_at_5 value: 41.276 - type: mrr_at_1 value: 39.936 - type: mrr_at_10 value: 48.484 - type: mrr_at_100 value: 49.123 - type: mrr_at_1000 value: 49.163000000000004 - type: mrr_at_3 value: 46.221000000000004 - type: mrr_at_5 value: 47.603 - type: ndcg_at_1 value: 39.936 - type: ndcg_at_10 value: 48.25 - type: ndcg_at_100 value: 52.674 - type: ndcg_at_1000 value: 54.638 - type: ndcg_at_3 value: 44.05 - type: ndcg_at_5 value: 46.125 - type: precision_at_1 value: 39.936 - type: precision_at_10 value: 9.096 - type: precision_at_100 value: 1.473 - type: precision_at_1000 value: 0.19499999999999998 - type: precision_at_3 value: 21.295 - type: precision_at_5 value: 15.121 - type: recall_at_1 value: 32.106 - type: recall_at_10 value: 58.107 - type: recall_at_100 value: 76.873 - type: recall_at_1000 value: 89.079 - type: recall_at_3 value: 45.505 - type: recall_at_5 value: 51.479 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackGamingRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 41.513 - type: map_at_10 value: 54.571999999999996 - type: map_at_100 value: 55.579 - type: map_at_1000 value: 55.626 - type: map_at_3 value: 51.127 - type: map_at_5 value: 53.151 - type: mrr_at_1 value: 47.398 - type: mrr_at_10 value: 
57.82000000000001 - type: mrr_at_100 value: 58.457 - type: mrr_at_1000 value: 58.479000000000006 - type: mrr_at_3 value: 55.32899999999999 - type: mrr_at_5 value: 56.89999999999999 - type: ndcg_at_1 value: 47.398 - type: ndcg_at_10 value: 60.599000000000004 - type: ndcg_at_100 value: 64.366 - type: ndcg_at_1000 value: 65.333 - type: ndcg_at_3 value: 54.98 - type: ndcg_at_5 value: 57.874 - type: precision_at_1 value: 47.398 - type: precision_at_10 value: 9.806 - type: precision_at_100 value: 1.2590000000000001 - type: precision_at_1000 value: 0.13799999999999998 - type: precision_at_3 value: 24.619 - type: precision_at_5 value: 16.878 - type: recall_at_1 value: 41.513 - type: recall_at_10 value: 74.91799999999999 - type: recall_at_100 value: 90.96 - type: recall_at_1000 value: 97.923 - type: recall_at_3 value: 60.013000000000005 - type: recall_at_5 value: 67.245 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackGisRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 26.319 - type: map_at_10 value: 35.766999999999996 - type: map_at_100 value: 36.765 - type: map_at_1000 value: 36.829 - type: map_at_3 value: 32.888 - type: map_at_5 value: 34.538999999999994 - type: mrr_at_1 value: 28.249000000000002 - type: mrr_at_10 value: 37.766 - type: mrr_at_100 value: 38.62 - type: mrr_at_1000 value: 38.667 - type: mrr_at_3 value: 35.009 - type: mrr_at_5 value: 36.608000000000004 - type: ndcg_at_1 value: 28.249000000000002 - type: ndcg_at_10 value: 41.215 - type: ndcg_at_100 value: 46.274 - type: ndcg_at_1000 value: 48.007 - type: ndcg_at_3 value: 35.557 - type: ndcg_at_5 value: 38.344 - type: precision_at_1 value: 28.249000000000002 - type: precision_at_10 value: 6.429 - type: precision_at_100 value: 0.9480000000000001 - type: precision_at_1000 value: 0.11399999999999999 - type: precision_at_3 value: 15.179 - type: precision_at_5 value: 10.734 - type: recall_at_1 value: 26.319 - type: recall_at_10 value: 56.157999999999994 - type: recall_at_100 value: 79.65 - type: recall_at_1000 value: 92.73 - type: recall_at_3 value: 40.738 - type: recall_at_5 value: 47.418 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackMathematicaRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 18.485 - type: map_at_10 value: 27.400999999999996 - type: map_at_100 value: 28.665000000000003 - type: map_at_1000 value: 28.79 - type: map_at_3 value: 24.634 - type: map_at_5 value: 26.313 - type: mrr_at_1 value: 23.134 - type: mrr_at_10 value: 32.332 - type: mrr_at_100 value: 33.318 - type: mrr_at_1000 value: 33.384 - type: mrr_at_3 value: 29.664 - type: mrr_at_5 value: 31.262 - type: ndcg_at_1 value: 23.134 - type: ndcg_at_10 value: 33.016 - type: ndcg_at_100 value: 38.763 - type: ndcg_at_1000 value: 41.619 - type: ndcg_at_3 value: 28.017999999999997 - type: ndcg_at_5 value: 30.576999999999998 - type: precision_at_1 value: 23.134 - type: precision_at_10 value: 6.069999999999999 - type: precision_at_100 value: 1.027 - type: precision_at_1000 value: 0.14200000000000002 - type: precision_at_3 value: 13.599 - type: precision_at_5 value: 9.975000000000001 - type: recall_at_1 value: 18.485 - type: recall_at_10 value: 45.39 - type: recall_at_100 value: 69.876 - type: recall_at_1000 value: 90.023 - type: recall_at_3 value: 31.587 - type: recall_at_5 value: 38.164 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackPhysicsRetrieval config: default split: test revision: None metrics: - type: 
map_at_1 value: 30.676 - type: map_at_10 value: 41.785 - type: map_at_100 value: 43.169000000000004 - type: map_at_1000 value: 43.272 - type: map_at_3 value: 38.462 - type: map_at_5 value: 40.32 - type: mrr_at_1 value: 37.729 - type: mrr_at_10 value: 47.433 - type: mrr_at_100 value: 48.303000000000004 - type: mrr_at_1000 value: 48.337 - type: mrr_at_3 value: 45.011 - type: mrr_at_5 value: 46.455 - type: ndcg_at_1 value: 37.729 - type: ndcg_at_10 value: 47.921 - type: ndcg_at_100 value: 53.477 - type: ndcg_at_1000 value: 55.300000000000004 - type: ndcg_at_3 value: 42.695 - type: ndcg_at_5 value: 45.175 - type: precision_at_1 value: 37.729 - type: precision_at_10 value: 8.652999999999999 - type: precision_at_100 value: 1.336 - type: precision_at_1000 value: 0.168 - type: precision_at_3 value: 20.18 - type: precision_at_5 value: 14.302000000000001 - type: recall_at_1 value: 30.676 - type: recall_at_10 value: 60.441 - type: recall_at_100 value: 83.37 - type: recall_at_1000 value: 95.092 - type: recall_at_3 value: 45.964 - type: recall_at_5 value: 52.319 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackProgrammersRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 24.978 - type: map_at_10 value: 35.926 - type: map_at_100 value: 37.341 - type: map_at_1000 value: 37.445 - type: map_at_3 value: 32.748 - type: map_at_5 value: 34.207 - type: mrr_at_1 value: 31.163999999999998 - type: mrr_at_10 value: 41.394 - type: mrr_at_100 value: 42.321 - type: mrr_at_1000 value: 42.368 - type: mrr_at_3 value: 38.964999999999996 - type: mrr_at_5 value: 40.135 - type: ndcg_at_1 value: 31.163999999999998 - type: ndcg_at_10 value: 42.191 - type: ndcg_at_100 value: 48.083999999999996 - type: ndcg_at_1000 value: 50.21 - type: ndcg_at_3 value: 36.979 - type: ndcg_at_5 value: 38.823 - type: precision_at_1 value: 31.163999999999998 - type: precision_at_10 value: 7.968 - type: precision_at_100 value: 1.2550000000000001 - type: precision_at_1000 value: 0.16199999999999998 - type: precision_at_3 value: 18.075 - type: precision_at_5 value: 12.626000000000001 - type: recall_at_1 value: 24.978 - type: recall_at_10 value: 55.410000000000004 - type: recall_at_100 value: 80.562 - type: recall_at_1000 value: 94.77600000000001 - type: recall_at_3 value: 40.359 - type: recall_at_5 value: 45.577 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 26.812166666666666 - type: map_at_10 value: 36.706916666666665 - type: map_at_100 value: 37.94016666666666 - type: map_at_1000 value: 38.05358333333333 - type: map_at_3 value: 33.72408333333334 - type: map_at_5 value: 35.36508333333333 - type: mrr_at_1 value: 31.91516666666667 - type: mrr_at_10 value: 41.09716666666666 - type: mrr_at_100 value: 41.931916666666666 - type: mrr_at_1000 value: 41.98458333333333 - type: mrr_at_3 value: 38.60183333333333 - type: mrr_at_5 value: 40.031916666666675 - type: ndcg_at_1 value: 31.91516666666667 - type: ndcg_at_10 value: 42.38725 - type: ndcg_at_100 value: 47.56291666666667 - type: ndcg_at_1000 value: 49.716499999999996 - type: ndcg_at_3 value: 37.36491666666667 - type: ndcg_at_5 value: 39.692166666666665 - type: precision_at_1 value: 31.91516666666667 - type: precision_at_10 value: 7.476749999999999 - type: precision_at_100 value: 1.1869166666666668 - type: precision_at_1000 value: 0.157 - type: precision_at_3 value: 17.275249999999996 - type: precision_at_5 value: 12.25825 - type: 
recall_at_1 value: 26.812166666666666 - type: recall_at_10 value: 54.82933333333333 - type: recall_at_100 value: 77.36508333333333 - type: recall_at_1000 value: 92.13366666666667 - type: recall_at_3 value: 40.83508333333334 - type: recall_at_5 value: 46.85083333333334 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackStatsRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 25.352999999999998 - type: map_at_10 value: 33.025999999999996 - type: map_at_100 value: 33.882 - type: map_at_1000 value: 33.983999999999995 - type: map_at_3 value: 30.995 - type: map_at_5 value: 32.113 - type: mrr_at_1 value: 28.834 - type: mrr_at_10 value: 36.14 - type: mrr_at_100 value: 36.815 - type: mrr_at_1000 value: 36.893 - type: mrr_at_3 value: 34.305 - type: mrr_at_5 value: 35.263 - type: ndcg_at_1 value: 28.834 - type: ndcg_at_10 value: 37.26 - type: ndcg_at_100 value: 41.723 - type: ndcg_at_1000 value: 44.314 - type: ndcg_at_3 value: 33.584 - type: ndcg_at_5 value: 35.302 - type: precision_at_1 value: 28.834 - type: precision_at_10 value: 5.736 - type: precision_at_100 value: 0.876 - type: precision_at_1000 value: 0.117 - type: precision_at_3 value: 14.468 - type: precision_at_5 value: 9.847 - type: recall_at_1 value: 25.352999999999998 - type: recall_at_10 value: 47.155 - type: recall_at_100 value: 68.024 - type: recall_at_1000 value: 87.26899999999999 - type: recall_at_3 value: 37.074 - type: recall_at_5 value: 41.352 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackTexRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 17.845 - type: map_at_10 value: 25.556 - type: map_at_100 value: 26.787 - type: map_at_1000 value: 26.913999999999998 - type: map_at_3 value: 23.075000000000003 - type: map_at_5 value: 24.308 - type: mrr_at_1 value: 21.714 - type: mrr_at_10 value: 29.543999999999997 - type: mrr_at_100 value: 30.543 - type: mrr_at_1000 value: 30.618000000000002 - type: mrr_at_3 value: 27.174 - type: mrr_at_5 value: 28.409000000000002 - type: ndcg_at_1 value: 21.714 - type: ndcg_at_10 value: 30.562 - type: ndcg_at_100 value: 36.27 - type: ndcg_at_1000 value: 39.033 - type: ndcg_at_3 value: 26.006 - type: ndcg_at_5 value: 27.843 - type: precision_at_1 value: 21.714 - type: precision_at_10 value: 5.657 - type: precision_at_100 value: 1 - type: precision_at_1000 value: 0.14100000000000001 - type: precision_at_3 value: 12.4 - type: precision_at_5 value: 8.863999999999999 - type: recall_at_1 value: 17.845 - type: recall_at_10 value: 41.72 - type: recall_at_100 value: 67.06400000000001 - type: recall_at_1000 value: 86.515 - type: recall_at_3 value: 28.78 - type: recall_at_5 value: 33.629999999999995 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackUnixRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 26.695 - type: map_at_10 value: 36.205999999999996 - type: map_at_100 value: 37.346000000000004 - type: map_at_1000 value: 37.447 - type: map_at_3 value: 32.84 - type: map_at_5 value: 34.733000000000004 - type: mrr_at_1 value: 31.343 - type: mrr_at_10 value: 40.335 - type: mrr_at_100 value: 41.162 - type: mrr_at_1000 value: 41.221000000000004 - type: mrr_at_3 value: 37.329 - type: mrr_at_5 value: 39.068999999999996 - type: ndcg_at_1 value: 31.343 - type: ndcg_at_10 value: 41.996 - type: ndcg_at_100 value: 47.096 - type: ndcg_at_1000 value: 49.4 - type: ndcg_at_3 value: 35.902 - type: ndcg_at_5 value: 38.848 - type: precision_at_1 
value: 31.343 - type: precision_at_10 value: 7.146 - type: precision_at_100 value: 1.098 - type: precision_at_1000 value: 0.14100000000000001 - type: precision_at_3 value: 16.014 - type: precision_at_5 value: 11.735 - type: recall_at_1 value: 26.695 - type: recall_at_10 value: 55.525000000000006 - type: recall_at_100 value: 77.376 - type: recall_at_1000 value: 93.476 - type: recall_at_3 value: 39.439 - type: recall_at_5 value: 46.501 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackWebmastersRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 24.196 - type: map_at_10 value: 33.516 - type: map_at_100 value: 35.202 - type: map_at_1000 value: 35.426 - type: map_at_3 value: 30.561 - type: map_at_5 value: 31.961000000000002 - type: mrr_at_1 value: 29.644 - type: mrr_at_10 value: 38.769 - type: mrr_at_100 value: 39.843 - type: mrr_at_1000 value: 39.888 - type: mrr_at_3 value: 36.132999999999996 - type: mrr_at_5 value: 37.467 - type: ndcg_at_1 value: 29.644 - type: ndcg_at_10 value: 39.584 - type: ndcg_at_100 value: 45.964 - type: ndcg_at_1000 value: 48.27 - type: ndcg_at_3 value: 34.577999999999996 - type: ndcg_at_5 value: 36.498000000000005 - type: precision_at_1 value: 29.644 - type: precision_at_10 value: 7.668 - type: precision_at_100 value: 1.545 - type: precision_at_1000 value: 0.242 - type: precision_at_3 value: 16.271 - type: precision_at_5 value: 11.620999999999999 - type: recall_at_1 value: 24.196 - type: recall_at_10 value: 51.171 - type: recall_at_100 value: 79.212 - type: recall_at_1000 value: 92.976 - type: recall_at_3 value: 36.797999999999995 - type: recall_at_5 value: 42.006 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackWordpressRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 21.023 - type: map_at_10 value: 29.677 - type: map_at_100 value: 30.618000000000002 - type: map_at_1000 value: 30.725 - type: map_at_3 value: 27.227 - type: map_at_5 value: 28.523 - type: mrr_at_1 value: 22.921 - type: mrr_at_10 value: 31.832 - type: mrr_at_100 value: 32.675 - type: mrr_at_1000 value: 32.751999999999995 - type: mrr_at_3 value: 29.513 - type: mrr_at_5 value: 30.89 - type: ndcg_at_1 value: 22.921 - type: ndcg_at_10 value: 34.699999999999996 - type: ndcg_at_100 value: 39.302 - type: ndcg_at_1000 value: 41.919000000000004 - type: ndcg_at_3 value: 29.965999999999998 - type: ndcg_at_5 value: 32.22 - type: precision_at_1 value: 22.921 - type: precision_at_10 value: 5.564 - type: precision_at_100 value: 0.8340000000000001 - type: precision_at_1000 value: 0.11800000000000001 - type: precision_at_3 value: 13.123999999999999 - type: precision_at_5 value: 9.316 - type: recall_at_1 value: 21.023 - type: recall_at_10 value: 48.015 - type: recall_at_100 value: 68.978 - type: recall_at_1000 value: 88.198 - type: recall_at_3 value: 35.397 - type: recall_at_5 value: 40.701 - task: type: Retrieval dataset: type: climate-fever name: MTEB ClimateFEVER config: default split: test revision: None metrics: - type: map_at_1 value: 11.198 - type: map_at_10 value: 19.336000000000002 - type: map_at_100 value: 21.382 - type: map_at_1000 value: 21.581 - type: map_at_3 value: 15.992 - type: map_at_5 value: 17.613 - type: mrr_at_1 value: 25.080999999999996 - type: mrr_at_10 value: 36.032 - type: mrr_at_100 value: 37.1 - type: mrr_at_1000 value: 37.145 - type: mrr_at_3 value: 32.595 - type: mrr_at_5 value: 34.553 - type: ndcg_at_1 value: 25.080999999999996 - type: ndcg_at_10 value: 
27.290999999999997 - type: ndcg_at_100 value: 35.31 - type: ndcg_at_1000 value: 38.885 - type: ndcg_at_3 value: 21.895999999999997 - type: ndcg_at_5 value: 23.669999999999998 - type: precision_at_1 value: 25.080999999999996 - type: precision_at_10 value: 8.645 - type: precision_at_100 value: 1.7209999999999999 - type: precision_at_1000 value: 0.23900000000000002 - type: precision_at_3 value: 16.287 - type: precision_at_5 value: 12.625 - type: recall_at_1 value: 11.198 - type: recall_at_10 value: 33.355000000000004 - type: recall_at_100 value: 60.912 - type: recall_at_1000 value: 80.89 - type: recall_at_3 value: 20.055 - type: recall_at_5 value: 25.14 - task: type: Retrieval dataset: type: dbpedia-entity name: MTEB DBPedia config: default split: test revision: None metrics: - type: map_at_1 value: 9.228 - type: map_at_10 value: 20.018 - type: map_at_100 value: 28.388999999999996 - type: map_at_1000 value: 30.073 - type: map_at_3 value: 14.366999999999999 - type: map_at_5 value: 16.705000000000002 - type: mrr_at_1 value: 69 - type: mrr_at_10 value: 77.058 - type: mrr_at_100 value: 77.374 - type: mrr_at_1000 value: 77.384 - type: mrr_at_3 value: 75.708 - type: mrr_at_5 value: 76.608 - type: ndcg_at_1 value: 57.49999999999999 - type: ndcg_at_10 value: 41.792 - type: ndcg_at_100 value: 47.374 - type: ndcg_at_1000 value: 55.13 - type: ndcg_at_3 value: 46.353 - type: ndcg_at_5 value: 43.702000000000005 - type: precision_at_1 value: 69 - type: precision_at_10 value: 32.85 - type: precision_at_100 value: 10.708 - type: precision_at_1000 value: 2.024 - type: precision_at_3 value: 49.5 - type: precision_at_5 value: 42.05 - type: recall_at_1 value: 9.228 - type: recall_at_10 value: 25.635 - type: recall_at_100 value: 54.894 - type: recall_at_1000 value: 79.38 - type: recall_at_3 value: 15.68 - type: recall_at_5 value: 19.142 - task: type: Classification dataset: type: mteb/emotion name: MTEB EmotionClassification config: default split: test revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37 metrics: - type: accuracy value: 52.035 - type: f1 value: 46.85325505614071 - task: type: Retrieval dataset: type: fever name: MTEB FEVER config: default split: test revision: None metrics: - type: map_at_1 value: 70.132 - type: map_at_10 value: 79.527 - type: map_at_100 value: 79.81200000000001 - type: map_at_1000 value: 79.828 - type: map_at_3 value: 78.191 - type: map_at_5 value: 79.092 - type: mrr_at_1 value: 75.563 - type: mrr_at_10 value: 83.80199999999999 - type: mrr_at_100 value: 83.93 - type: mrr_at_1000 value: 83.933 - type: mrr_at_3 value: 82.818 - type: mrr_at_5 value: 83.505 - type: ndcg_at_1 value: 75.563 - type: ndcg_at_10 value: 83.692 - type: ndcg_at_100 value: 84.706 - type: ndcg_at_1000 value: 85.001 - type: ndcg_at_3 value: 81.51 - type: ndcg_at_5 value: 82.832 - type: precision_at_1 value: 75.563 - type: precision_at_10 value: 10.245 - type: precision_at_100 value: 1.0959999999999999 - type: precision_at_1000 value: 0.11399999999999999 - type: precision_at_3 value: 31.518 - type: precision_at_5 value: 19.772000000000002 - type: recall_at_1 value: 70.132 - type: recall_at_10 value: 92.204 - type: recall_at_100 value: 96.261 - type: recall_at_1000 value: 98.17399999999999 - type: recall_at_3 value: 86.288 - type: recall_at_5 value: 89.63799999999999 - task: type: Retrieval dataset: type: fiqa name: MTEB FiQA2018 config: default split: test revision: None metrics: - type: map_at_1 value: 22.269 - type: map_at_10 value: 36.042 - type: map_at_100 value: 37.988 - type: map_at_1000 value: 38.162 - 
type: map_at_3 value: 31.691000000000003 - type: map_at_5 value: 33.988 - type: mrr_at_1 value: 44.907000000000004 - type: mrr_at_10 value: 53.348 - type: mrr_at_100 value: 54.033 - type: mrr_at_1000 value: 54.064 - type: mrr_at_3 value: 50.977 - type: mrr_at_5 value: 52.112 - type: ndcg_at_1 value: 44.907000000000004 - type: ndcg_at_10 value: 44.302 - type: ndcg_at_100 value: 51.054 - type: ndcg_at_1000 value: 53.822 - type: ndcg_at_3 value: 40.615 - type: ndcg_at_5 value: 41.455999999999996 - type: precision_at_1 value: 44.907000000000004 - type: precision_at_10 value: 12.176 - type: precision_at_100 value: 1.931 - type: precision_at_1000 value: 0.243 - type: precision_at_3 value: 27.16 - type: precision_at_5 value: 19.567999999999998 - type: recall_at_1 value: 22.269 - type: recall_at_10 value: 51.188 - type: recall_at_100 value: 75.924 - type: recall_at_1000 value: 92.525 - type: recall_at_3 value: 36.643 - type: recall_at_5 value: 42.27 - task: type: Retrieval dataset: type: hotpotqa name: MTEB HotpotQA config: default split: test revision: None metrics: - type: map_at_1 value: 40.412 - type: map_at_10 value: 66.376 - type: map_at_100 value: 67.217 - type: map_at_1000 value: 67.271 - type: map_at_3 value: 62.741 - type: map_at_5 value: 65.069 - type: mrr_at_1 value: 80.824 - type: mrr_at_10 value: 86.53 - type: mrr_at_100 value: 86.67399999999999 - type: mrr_at_1000 value: 86.678 - type: mrr_at_3 value: 85.676 - type: mrr_at_5 value: 86.256 - type: ndcg_at_1 value: 80.824 - type: ndcg_at_10 value: 74.332 - type: ndcg_at_100 value: 77.154 - type: ndcg_at_1000 value: 78.12400000000001 - type: ndcg_at_3 value: 69.353 - type: ndcg_at_5 value: 72.234 - type: precision_at_1 value: 80.824 - type: precision_at_10 value: 15.652 - type: precision_at_100 value: 1.7840000000000003 - type: precision_at_1000 value: 0.191 - type: precision_at_3 value: 44.911 - type: precision_at_5 value: 29.221000000000004 - type: recall_at_1 value: 40.412 - type: recall_at_10 value: 78.25800000000001 - type: recall_at_100 value: 89.196 - type: recall_at_1000 value: 95.544 - type: recall_at_3 value: 67.367 - type: recall_at_5 value: 73.05199999999999 - task: type: Classification dataset: type: mteb/imdb name: MTEB ImdbClassification config: default split: test revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7 metrics: - type: accuracy value: 92.78880000000001 - type: ap value: 89.39251741048801 - type: f1 value: 92.78019950076781 - task: type: Retrieval dataset: type: msmarco name: MTEB MSMARCO config: default split: dev revision: None metrics: - type: map_at_1 value: 22.888 - type: map_at_10 value: 35.146 - type: map_at_100 value: 36.325 - type: map_at_1000 value: 36.372 - type: map_at_3 value: 31.3 - type: map_at_5 value: 33.533 - type: mrr_at_1 value: 23.480999999999998 - type: mrr_at_10 value: 35.777 - type: mrr_at_100 value: 36.887 - type: mrr_at_1000 value: 36.928 - type: mrr_at_3 value: 31.989 - type: mrr_at_5 value: 34.202 - type: ndcg_at_1 value: 23.496 - type: ndcg_at_10 value: 42.028999999999996 - type: ndcg_at_100 value: 47.629 - type: ndcg_at_1000 value: 48.785000000000004 - type: ndcg_at_3 value: 34.227000000000004 - type: ndcg_at_5 value: 38.207 - type: precision_at_1 value: 23.496 - type: precision_at_10 value: 6.596 - type: precision_at_100 value: 0.9400000000000001 - type: precision_at_1000 value: 0.104 - type: precision_at_3 value: 14.513000000000002 - type: precision_at_5 value: 10.711 - type: recall_at_1 value: 22.888 - type: recall_at_10 value: 63.129999999999995 - type: recall_at_100 value: 
88.90299999999999 - type: recall_at_1000 value: 97.69 - type: recall_at_3 value: 42.014 - type: recall_at_5 value: 51.554 - task: type: Classification dataset: type: mteb/mtop_domain name: MTEB MTOPDomainClassification (en) config: en split: test revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf metrics: - type: accuracy value: 94.59188326493388 - type: f1 value: 94.36568950290486 - task: type: Classification dataset: type: mteb/mtop_intent name: MTEB MTOPIntentClassification (en) config: en split: test revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba metrics: - type: accuracy value: 79.25672594619242 - type: f1 value: 59.52405059722216 - task: type: Classification dataset: type: mteb/amazon_massive_intent name: MTEB MassiveIntentClassification (en) config: en split: test revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 metrics: - type: accuracy value: 77.4142568930733 - type: f1 value: 75.23044196543388 - task: type: Classification dataset: type: mteb/amazon_massive_scenario name: MTEB MassiveScenarioClassification (en) config: en split: test revision: 7d571f92784cd94a019292a1f45445077d0ef634 metrics: - type: accuracy value: 80.44720914593141 - type: f1 value: 80.41049641537015 - task: type: Clustering dataset: type: mteb/medrxiv-clustering-p2p name: MTEB MedrxivClusteringP2P config: default split: test revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73 metrics: - type: v_measure value: 31.960921474993775 - task: type: Clustering dataset: type: mteb/medrxiv-clustering-s2s name: MTEB MedrxivClusteringS2S config: default split: test revision: 35191c8c0dca72d8ff3efcd72aa802307d469663 metrics: - type: v_measure value: 30.88042240204361 - task: type: Reranking dataset: type: mteb/mind_small name: MTEB MindSmallReranking config: default split: test revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69 metrics: - type: map value: 32.27071371606404 - type: mrr value: 33.541450459533856 - task: type: Retrieval dataset: type: nfcorpus name: MTEB NFCorpus config: default split: test revision: None metrics: - type: map_at_1 value: 6.551 - type: map_at_10 value: 14.359 - type: map_at_100 value: 18.157 - type: map_at_1000 value: 19.659 - type: map_at_3 value: 10.613999999999999 - type: map_at_5 value: 12.296 - type: mrr_at_1 value: 47.368 - type: mrr_at_10 value: 56.689 - type: mrr_at_100 value: 57.24399999999999 - type: mrr_at_1000 value: 57.284 - type: mrr_at_3 value: 54.489 - type: mrr_at_5 value: 55.928999999999995 - type: ndcg_at_1 value: 45.511 - type: ndcg_at_10 value: 36.911 - type: ndcg_at_100 value: 34.241 - type: ndcg_at_1000 value: 43.064 - type: ndcg_at_3 value: 42.348 - type: ndcg_at_5 value: 39.884 - type: precision_at_1 value: 46.749 - type: precision_at_10 value: 27.028000000000002 - type: precision_at_100 value: 8.52 - type: precision_at_1000 value: 2.154 - type: precision_at_3 value: 39.525 - type: precision_at_5 value: 34.18 - type: recall_at_1 value: 6.551 - type: recall_at_10 value: 18.602 - type: recall_at_100 value: 34.882999999999996 - type: recall_at_1000 value: 66.049 - type: recall_at_3 value: 11.872 - type: recall_at_5 value: 14.74 - task: type: Retrieval dataset: type: nq name: MTEB NQ config: default split: test revision: None metrics: - type: map_at_1 value: 27.828999999999997 - type: map_at_10 value: 43.606 - type: map_at_100 value: 44.656 - type: map_at_1000 value: 44.690000000000005 - type: map_at_3 value: 39.015 - type: map_at_5 value: 41.625 - type: mrr_at_1 value: 31.518 - type: mrr_at_10 value: 46.047 - type: mrr_at_100 value: 46.846 - type: mrr_at_1000 value: 
46.867999999999995 - type: mrr_at_3 value: 42.154 - type: mrr_at_5 value: 44.468999999999994 - type: ndcg_at_1 value: 31.518 - type: ndcg_at_10 value: 51.768 - type: ndcg_at_100 value: 56.184999999999995 - type: ndcg_at_1000 value: 56.92 - type: ndcg_at_3 value: 43.059999999999995 - type: ndcg_at_5 value: 47.481 - type: precision_at_1 value: 31.518 - type: precision_at_10 value: 8.824 - type: precision_at_100 value: 1.131 - type: precision_at_1000 value: 0.12 - type: precision_at_3 value: 19.969 - type: precision_at_5 value: 14.502 - type: recall_at_1 value: 27.828999999999997 - type: recall_at_10 value: 74.244 - type: recall_at_100 value: 93.325 - type: recall_at_1000 value: 98.71799999999999 - type: recall_at_3 value: 51.601 - type: recall_at_5 value: 61.841 - task: type: Retrieval dataset: type: quora name: MTEB QuoraRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 71.54 - type: map_at_10 value: 85.509 - type: map_at_100 value: 86.137 - type: map_at_1000 value: 86.151 - type: map_at_3 value: 82.624 - type: map_at_5 value: 84.425 - type: mrr_at_1 value: 82.45 - type: mrr_at_10 value: 88.344 - type: mrr_at_100 value: 88.437 - type: mrr_at_1000 value: 88.437 - type: mrr_at_3 value: 87.417 - type: mrr_at_5 value: 88.066 - type: ndcg_at_1 value: 82.45 - type: ndcg_at_10 value: 89.092 - type: ndcg_at_100 value: 90.252 - type: ndcg_at_1000 value: 90.321 - type: ndcg_at_3 value: 86.404 - type: ndcg_at_5 value: 87.883 - type: precision_at_1 value: 82.45 - type: precision_at_10 value: 13.496 - type: precision_at_100 value: 1.536 - type: precision_at_1000 value: 0.157 - type: precision_at_3 value: 37.833 - type: precision_at_5 value: 24.79 - type: recall_at_1 value: 71.54 - type: recall_at_10 value: 95.846 - type: recall_at_100 value: 99.715 - type: recall_at_1000 value: 99.979 - type: recall_at_3 value: 88.01299999999999 - type: recall_at_5 value: 92.32000000000001 - task: type: Clustering dataset: type: mteb/reddit-clustering name: MTEB RedditClustering config: default split: test revision: 24640382cdbf8abc73003fb0fa6d111a705499eb metrics: - type: v_measure value: 57.60557586253866 - task: type: Clustering dataset: type: mteb/reddit-clustering-p2p name: MTEB RedditClusteringP2P config: default split: test revision: 282350215ef01743dc01b456c7f5241fa8937f16 metrics: - type: v_measure value: 64.0287172242051 - task: type: Retrieval dataset: type: scidocs name: MTEB SCIDOCS config: default split: test revision: None metrics: - type: map_at_1 value: 3.9849999999999994 - type: map_at_10 value: 11.397 - type: map_at_100 value: 13.985 - type: map_at_1000 value: 14.391000000000002 - type: map_at_3 value: 7.66 - type: map_at_5 value: 9.46 - type: mrr_at_1 value: 19.8 - type: mrr_at_10 value: 31.958 - type: mrr_at_100 value: 33.373999999999995 - type: mrr_at_1000 value: 33.411 - type: mrr_at_3 value: 28.316999999999997 - type: mrr_at_5 value: 30.297 - type: ndcg_at_1 value: 19.8 - type: ndcg_at_10 value: 19.580000000000002 - type: ndcg_at_100 value: 29.555999999999997 - type: ndcg_at_1000 value: 35.882 - type: ndcg_at_3 value: 17.544 - type: ndcg_at_5 value: 15.815999999999999 - type: precision_at_1 value: 19.8 - type: precision_at_10 value: 10.61 - type: precision_at_100 value: 2.501 - type: precision_at_1000 value: 0.40099999999999997 - type: precision_at_3 value: 16.900000000000002 - type: precision_at_5 value: 14.44 - type: recall_at_1 value: 3.9849999999999994 - type: recall_at_10 value: 21.497 - type: recall_at_100 value: 50.727999999999994 - type: recall_at_1000 
value: 81.27499999999999 - type: recall_at_3 value: 10.263 - type: recall_at_5 value: 14.643 - task: type: STS dataset: type: mteb/sickr-sts name: MTEB SICK-R config: default split: test revision: a6ea5a8cab320b040a23452cc28066d9beae2cee metrics: - type: cos_sim_pearson value: 85.0087509585503 - type: cos_sim_spearman value: 81.74697270664319 - type: euclidean_pearson value: 81.80424382731947 - type: euclidean_spearman value: 81.29794251968431 - type: manhattan_pearson value: 81.81524666226125 - type: manhattan_spearman value: 81.29475370198963 - task: type: STS dataset: type: mteb/sts12-sts name: MTEB STS12 config: default split: test revision: a0d554a64d88156834ff5ae9920b964011b16384 metrics: - type: cos_sim_pearson value: 86.44442736429552 - type: cos_sim_spearman value: 78.51011398910948 - type: euclidean_pearson value: 83.36181801196723 - type: euclidean_spearman value: 79.47272621331535 - type: manhattan_pearson value: 83.3660113483837 - type: manhattan_spearman value: 79.47695922566032 - task: type: STS dataset: type: mteb/sts13-sts name: MTEB STS13 config: default split: test revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca metrics: - type: cos_sim_pearson value: 85.82923943323635 - type: cos_sim_spearman value: 86.62037823380983 - type: euclidean_pearson value: 83.56369548403958 - type: euclidean_spearman value: 84.2176755481191 - type: manhattan_pearson value: 83.55460702084464 - type: manhattan_spearman value: 84.18617930921467 - task: type: STS dataset: type: mteb/sts14-sts name: MTEB STS14 config: default split: test revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375 metrics: - type: cos_sim_pearson value: 84.09071068110103 - type: cos_sim_spearman value: 83.05697553913335 - type: euclidean_pearson value: 81.1377457216497 - type: euclidean_spearman value: 81.74714169016676 - type: manhattan_pearson value: 81.0893424142723 - type: manhattan_spearman value: 81.7058918219677 - task: type: STS dataset: type: mteb/sts15-sts name: MTEB STS15 config: default split: test revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3 metrics: - type: cos_sim_pearson value: 87.61132157220429 - type: cos_sim_spearman value: 88.38581627185445 - type: euclidean_pearson value: 86.14904510913374 - type: euclidean_spearman value: 86.5452758925542 - type: manhattan_pearson value: 86.1484025377679 - type: manhattan_spearman value: 86.55483841566252 - task: type: STS dataset: type: mteb/sts16-sts name: MTEB STS16 config: default split: test revision: 4d8694f8f0e0100860b497b999b3dbed754a0513 metrics: - type: cos_sim_pearson value: 85.46195145161064 - type: cos_sim_spearman value: 86.82409112251158 - type: euclidean_pearson value: 84.75479672288957 - type: euclidean_spearman value: 85.41144307151548 - type: manhattan_pearson value: 84.70914329694165 - type: manhattan_spearman value: 85.38477943384089 - task: type: STS dataset: type: mteb/sts17-crosslingual-sts name: MTEB STS17 (en-en) config: en-en split: test revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d metrics: - type: cos_sim_pearson value: 88.06351289930238 - type: cos_sim_spearman value: 87.90311138579116 - type: euclidean_pearson value: 86.17651467063077 - type: euclidean_spearman value: 84.89447802019073 - type: manhattan_pearson value: 86.3267677479595 - type: manhattan_spearman value: 85.00472295103874 - task: type: STS dataset: type: mteb/sts22-crosslingual-sts name: MTEB STS22 (en) config: en split: test revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 metrics: - type: cos_sim_pearson value: 67.78311975978767 - type: cos_sim_spearman 
value: 66.76465685245887 - type: euclidean_pearson value: 67.21687806595443 - type: euclidean_spearman value: 65.05776733534435 - type: manhattan_pearson value: 67.14008143635883 - type: manhattan_spearman value: 65.25247076149701 - task: type: STS dataset: type: mteb/stsbenchmark-sts name: MTEB STSBenchmark config: default split: test revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831 metrics: - type: cos_sim_pearson value: 86.7403488889418 - type: cos_sim_spearman value: 87.76870289783061 - type: euclidean_pearson value: 84.83171077794671 - type: euclidean_spearman value: 85.50579695091902 - type: manhattan_pearson value: 84.83074260180555 - type: manhattan_spearman value: 85.47589026938667 - task: type: Reranking dataset: type: mteb/scidocs-reranking name: MTEB SciDocsRR config: default split: test revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab metrics: - type: map value: 87.56234016237356 - type: mrr value: 96.26124238869338 - task: type: Retrieval dataset: type: scifact name: MTEB SciFact config: default split: test revision: None metrics: - type: map_at_1 value: 59.660999999999994 - type: map_at_10 value: 69.105 - type: map_at_100 value: 69.78 - type: map_at_1000 value: 69.80199999999999 - type: map_at_3 value: 65.991 - type: map_at_5 value: 68.02 - type: mrr_at_1 value: 62.666999999999994 - type: mrr_at_10 value: 70.259 - type: mrr_at_100 value: 70.776 - type: mrr_at_1000 value: 70.796 - type: mrr_at_3 value: 67.889 - type: mrr_at_5 value: 69.52199999999999 - type: ndcg_at_1 value: 62.666999999999994 - type: ndcg_at_10 value: 73.425 - type: ndcg_at_100 value: 75.955 - type: ndcg_at_1000 value: 76.459 - type: ndcg_at_3 value: 68.345 - type: ndcg_at_5 value: 71.319 - type: precision_at_1 value: 62.666999999999994 - type: precision_at_10 value: 9.667 - type: precision_at_100 value: 1.09 - type: precision_at_1000 value: 0.11299999999999999 - type: precision_at_3 value: 26.333000000000002 - type: precision_at_5 value: 17.732999999999997 - type: recall_at_1 value: 59.660999999999994 - type: recall_at_10 value: 85.422 - type: recall_at_100 value: 96.167 - type: recall_at_1000 value: 100 - type: recall_at_3 value: 72.044 - type: recall_at_5 value: 79.428 - task: type: PairClassification dataset: type: mteb/sprintduplicatequestions-pairclassification name: MTEB SprintDuplicateQuestions config: default split: test revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46 metrics: - type: cos_sim_accuracy value: 99.86435643564356 - type: cos_sim_ap value: 96.83057412333741 - type: cos_sim_f1 value: 93.04215337734891 - type: cos_sim_precision value: 94.53044375644994 - type: cos_sim_recall value: 91.60000000000001 - type: dot_accuracy value: 99.7910891089109 - type: dot_ap value: 94.10681982106397 - type: dot_f1 value: 89.34881373043918 - type: dot_precision value: 90.21406727828746 - type: dot_recall value: 88.5 - type: euclidean_accuracy value: 99.85544554455446 - type: euclidean_ap value: 96.78545104478602 - type: euclidean_f1 value: 92.65143992055613 - type: euclidean_precision value: 92.01183431952663 - type: euclidean_recall value: 93.30000000000001 - type: manhattan_accuracy value: 99.85841584158416 - type: manhattan_ap value: 96.80748903307823 - type: manhattan_f1 value: 92.78247884519662 - type: manhattan_precision value: 92.36868186323092 - type: manhattan_recall value: 93.2 - type: max_accuracy value: 99.86435643564356 - type: max_ap value: 96.83057412333741 - type: max_f1 value: 93.04215337734891 - task: type: Clustering dataset: type: mteb/stackexchange-clustering name: MTEB 
StackExchangeClustering config: default split: test revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259 metrics: - type: v_measure value: 65.53971025855282 - task: type: Clustering dataset: type: mteb/stackexchange-clustering-p2p name: MTEB StackExchangeClusteringP2P config: default split: test revision: 815ca46b2622cec33ccafc3735d572c266efdb44 metrics: - type: v_measure value: 33.97791591490788 - task: type: Reranking dataset: type: mteb/stackoverflowdupquestions-reranking name: MTEB StackOverflowDupQuestions config: default split: test revision: e185fbe320c72810689fc5848eb6114e1ef5ec69 metrics: - type: map value: 55.852215301355066 - type: mrr value: 56.85527809608691 - task: type: Summarization dataset: type: mteb/summeval name: MTEB SummEval config: default split: test revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c metrics: - type: cos_sim_pearson value: 31.21442519856758 - type: cos_sim_spearman value: 30.822536216936825 - type: dot_pearson value: 28.661325528121807 - type: dot_spearman value: 28.1435226478879 - task: type: Retrieval dataset: type: trec-covid name: MTEB TRECCOVID config: default split: test revision: None metrics: - type: map_at_1 value: 0.183 - type: map_at_10 value: 1.526 - type: map_at_100 value: 7.915 - type: map_at_1000 value: 19.009 - type: map_at_3 value: 0.541 - type: map_at_5 value: 0.8659999999999999 - type: mrr_at_1 value: 68 - type: mrr_at_10 value: 81.186 - type: mrr_at_100 value: 81.186 - type: mrr_at_1000 value: 81.186 - type: mrr_at_3 value: 80 - type: mrr_at_5 value: 80.9 - type: ndcg_at_1 value: 64 - type: ndcg_at_10 value: 64.13799999999999 - type: ndcg_at_100 value: 47.632000000000005 - type: ndcg_at_1000 value: 43.037 - type: ndcg_at_3 value: 67.542 - type: ndcg_at_5 value: 67.496 - type: precision_at_1 value: 68 - type: precision_at_10 value: 67.80000000000001 - type: precision_at_100 value: 48.980000000000004 - type: precision_at_1000 value: 19.036 - type: precision_at_3 value: 72 - type: precision_at_5 value: 71.2 - type: recall_at_1 value: 0.183 - type: recall_at_10 value: 1.799 - type: recall_at_100 value: 11.652999999999999 - type: recall_at_1000 value: 40.086 - type: recall_at_3 value: 0.5930000000000001 - type: recall_at_5 value: 0.983 - task: type: Retrieval dataset: type: webis-touche2020 name: MTEB Touche2020 config: default split: test revision: None metrics: - type: map_at_1 value: 2.29 - type: map_at_10 value: 9.489 - type: map_at_100 value: 15.051 - type: map_at_1000 value: 16.561999999999998 - type: map_at_3 value: 5.137 - type: map_at_5 value: 6.7989999999999995 - type: mrr_at_1 value: 28.571 - type: mrr_at_10 value: 45.699 - type: mrr_at_100 value: 46.461000000000006 - type: mrr_at_1000 value: 46.461000000000006 - type: mrr_at_3 value: 41.837 - type: mrr_at_5 value: 43.163000000000004 - type: ndcg_at_1 value: 23.469 - type: ndcg_at_10 value: 23.544999999999998 - type: ndcg_at_100 value: 34.572 - type: ndcg_at_1000 value: 46.035 - type: ndcg_at_3 value: 27.200000000000003 - type: ndcg_at_5 value: 25.266 - type: precision_at_1 value: 28.571 - type: precision_at_10 value: 22.041 - type: precision_at_100 value: 7.3469999999999995 - type: precision_at_1000 value: 1.484 - type: precision_at_3 value: 29.932 - type: precision_at_5 value: 26.531 - type: recall_at_1 value: 2.29 - type: recall_at_10 value: 15.895999999999999 - type: recall_at_100 value: 45.518 - type: recall_at_1000 value: 80.731 - type: recall_at_3 value: 6.433 - type: recall_at_5 value: 9.484 - task: type: Classification dataset: type: mteb/toxic_conversations_50k 
name: MTEB ToxicConversationsClassification config: default split: test revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c metrics: - type: accuracy value: 71.4178 - type: ap value: 14.575240629602373 - type: f1 value: 55.02449563229096 - task: type: Classification dataset: type: mteb/tweet_sentiment_extraction name: MTEB TweetSentimentExtractionClassification config: default split: test revision: d604517c81ca91fe16a244d1248fc021f9ecee7a metrics: - type: accuracy value: 60.00282965478212 - type: f1 value: 60.34413028768773 - task: type: Clustering dataset: type: mteb/twentynewsgroups-clustering name: MTEB TwentyNewsgroupsClustering config: default split: test revision: 6125ec4e24fa026cec8a478383ee943acfbd5449 metrics: - type: v_measure value: 50.409448342549936 - task: type: PairClassification dataset: type: mteb/twittersemeval2015-pairclassification name: MTEB TwitterSemEval2015 config: default split: test revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1 metrics: - type: cos_sim_accuracy value: 87.62591643321214 - type: cos_sim_ap value: 79.28766491329633 - type: cos_sim_f1 value: 71.98772064466617 - type: cos_sim_precision value: 69.8609731876862 - type: cos_sim_recall value: 74.24802110817942 - type: dot_accuracy value: 84.75293556654945 - type: dot_ap value: 69.72705761174353 - type: dot_f1 value: 65.08692852543464 - type: dot_precision value: 63.57232704402516 - type: dot_recall value: 66.6754617414248 - type: euclidean_accuracy value: 87.44710019669786 - type: euclidean_ap value: 79.11021477292638 - type: euclidean_f1 value: 71.5052389470994 - type: euclidean_precision value: 69.32606541129832 - type: euclidean_recall value: 73.82585751978891 - type: manhattan_accuracy value: 87.42325803182929 - type: manhattan_ap value: 79.05094494327616 - type: manhattan_f1 value: 71.36333985649055 - type: manhattan_precision value: 70.58064516129032 - type: manhattan_recall value: 72.16358839050132 - type: max_accuracy value: 87.62591643321214 - type: max_ap value: 79.28766491329633 - type: max_f1 value: 71.98772064466617 - task: type: PairClassification dataset: type: mteb/twitterurlcorpus-pairclassification name: MTEB TwitterURLCorpus config: default split: test revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf metrics: - type: cos_sim_accuracy value: 88.85202002561415 - type: cos_sim_ap value: 85.9835303311168 - type: cos_sim_f1 value: 78.25741142443962 - type: cos_sim_precision value: 73.76635768811342 - type: cos_sim_recall value: 83.3307668617185 - type: dot_accuracy value: 88.20584468506229 - type: dot_ap value: 83.591632302697 - type: dot_f1 value: 76.81739705396173 - type: dot_precision value: 73.45275728837373 - type: dot_recall value: 80.50508161379734 - type: euclidean_accuracy value: 88.64633057787093 - type: euclidean_ap value: 85.25705123182283 - type: euclidean_f1 value: 77.18535726329199 - type: euclidean_precision value: 75.17699437997226 - type: euclidean_recall value: 79.30397289805975 - type: manhattan_accuracy value: 88.63274731245392 - type: manhattan_ap value: 85.2376825633018 - type: manhattan_f1 value: 77.15810785937788 - type: manhattan_precision value: 73.92255061014319 - type: manhattan_recall value: 80.68986757006468 - type: max_accuracy value: 88.85202002561415 - type: max_ap value: 85.9835303311168 - type: max_f1 value: 78.25741142443962 ---
**Supported languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

**Llama 3.1 family of models.** Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability.

**Model Release Date:** July 23, 2024.

**Status:** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback.

**License:** A custom commercial license, the Llama 3.1 Community License, is available at:

**Where to send questions or comments about the model:** Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go here.

## Intended Use

**Intended Use Cases** Llama 3.1 is intended for commercial and research use in multiple languages. Instruction-tuned text-only models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. The Llama 3.1 model collection also supports the ability to leverage the outputs of its models to improve other models, including synthetic data generation and distillation. The Llama 3.1 Community License allows for these use cases.

**Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.1 Community License. Use in languages beyond those explicitly referenced as supported in this model card.

**Note:** Llama 3.1 has been trained on a broader collection of languages than the 8 supported languages. Developers may fine-tune Llama 3.1 models for languages beyond the 8 supported languages provided they comply with the Llama 3.1 Community License and the Acceptable Use Policy, and in such cases are responsible for ensuring that any use of Llama 3.1 in additional languages is done in a safe and responsible manner.

## How to use

This repository contains two versions of Meta-Llama-3.1-70B-Instruct, for use with transformers and with the original codebase.

### Use with transformers

Starting with `transformers >= 4.43.0` onward, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the `generate()` function. Make sure to update your transformers installation via `pip install --upgrade transformers`. See the snippet below for usage with Transformers.

### Tool use with transformers

LLaMA-3.1 supports multiple tool use formats. You can see a full guide to prompt formatting here. Tool use is also supported through chat templates in Transformers. Here is a quick example showing a single simple tool. You can then generate text from this input as normal. If the model generates a tool call, you should add it to the chat, then call the tool and append the result with the tool role. After that, you can generate again to let the model use the tool result in the chat. Note that this was a very brief introduction to tool calling - for more information, see the LLaMA prompt format docs and the Transformers tool use documentation.

### Use with bitsandbytes

The model checkpoints can be used with bitsandbytes for further memory optimisations. See the snippet below for usage. To load in 4-bit, simply pass `load_in_4bit=True`.

### Use with llama

Please, follow the instructions in the repository.
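As an illustrative sketch of the conversational inference flow described in the "Use with transformers" subsection above (the model id, dtype, and generation settings below are examples rather than prescribed values):

```python
import torch
from transformers import pipeline

# Example model id assumed from the repository name.
model_id = "meta-llama/Meta-Llama-3.1-70B-Instruct"

pipe = pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]

# The text-generation pipeline applies the model's chat template to message lists.
outputs = pipe(messages, max_new_tokens=128)
print(outputs[0]["generated_text"][-1])
```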
To download Original checkpoints, see the example command below leveraging :

## Hardware and Software

**Training Factors** We used custom training libraries, Meta's custom built GPU cluster, and production infrastructure for pretraining. Fine-tuning, annotation, and evaluation were also performed on production infrastructure.

**Training utilized a cumulative of** 39.3M GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency.

**Training Greenhouse Gas Emissions** Estimated total location-based greenhouse gas emissions were **11,390** tons CO2eq for training. Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with renewable energy, therefore the total market-based greenhouse gas emissions for training were 0 tons CO2eq.
| | Training Time (GPU hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) |
| :--- | :---: | :---: | :---: | :---: |
| Llama 3.1 8B | 1.46M | 700 | 420 | 0 |
| Llama 3.1 70B | 7.0M | 700 | 2,040 | 0 |
| Llama 3.1 405B | 30.84M | 700 | 8,930 | 0 |
| Total | 39.3M | | 11,390 | 0 |
The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others.

## Training Data

**Overview:** Llama 3.1 was pretrained on ~15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 25M synthetically generated examples.

**Data Freshness:** The pretraining data has a cutoff of December 2023.

## Benchmark scores

In this section, we report the results for Llama 3.1 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library.

### Base pretrained models
**Supported languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

**Llama 3.1 family of models.** Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability.

**Model Release Date:** July 23, 2024.

**Status:** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback.

**License:** A custom commercial license, the Llama 3.1 Community License, is available at:

**Where to send questions or comments about the model:** Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go here.

## Intended Use

**Intended Use Cases** Llama 3.1 is intended for commercial and research use in multiple languages. Instruction-tuned text-only models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. The Llama 3.1 model collection also supports the ability to leverage the outputs of its models to improve other models, including synthetic data generation and distillation. The Llama 3.1 Community License allows for these use cases.

**Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.1 Community License. Use in languages beyond those explicitly referenced as supported in this model card.

**Note:** Llama 3.1 has been trained on a broader collection of languages than the 8 supported languages. Developers may fine-tune Llama 3.1 models for languages beyond the 8 supported languages provided they comply with the Llama 3.1 Community License and the Acceptable Use Policy, and in such cases are responsible for ensuring that any use of Llama 3.1 in additional languages is done in a safe and responsible manner.

## How to use

This repository contains two versions of Meta-Llama-3.1-8B-Instruct, for use with transformers and with the original codebase.

### Use with transformers

Starting with `transformers >= 4.43.0` onward, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the `generate()` function. Make sure to update your transformers installation via `pip install --upgrade transformers`. Note: You can also find detailed recipes on how to use the model locally, with assisted generation, quantisation, and more.

### Tool use with transformers

LLaMA-3.1 supports multiple tool use formats. You can see a full guide to prompt formatting here. Tool use is also supported through chat templates in Transformers. Here is a quick example showing a single simple tool (see the sketch below). You can then generate text from this input as normal. If the model generates a tool call, you should add it to the chat, then call the tool and append the result with the tool role. After that, you can generate again to let the model use the tool result in the chat. Note that this was a very brief introduction to tool calling - for more information, see the LLaMA prompt format docs and the Transformers tool use documentation.
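As an illustrative sketch of the tool-calling flow described above (the tool, its schema, and the model id are examples; the exact message format should be checked against the Transformers tool use documentation):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # example id assumed from the repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# A single simple tool, described as a Python function with type hints and a docstring.
def get_current_temperature(location: str) -> float:
    """
    Get the current temperature at a location.

    Args:
        location: The location to get the temperature for, in the format "City, Country"
    Returns:
        The current temperature at the specified location, as a float.
    """
    return 22.0  # illustrative stub

messages = [{"role": "user", "content": "What is the temperature in Paris, France right now?"}]

# Pass the tool definition so the chat template can expose it to the model.
inputs = tokenizer.apply_chat_template(
    messages, tools=[get_current_temperature], add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

# If the model emits a tool call, append it and the tool result, then apply the
# chat template and generate again so the model can use the result.
tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France"}}
messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]})
messages.append({"role": "tool", "name": "get_current_temperature", "content": "22.0"})
```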
### Use with llama

Please, follow the instructions in the repository. To download Original checkpoints, see the example command below leveraging :

## Hardware and Software

**Training Factors** We used custom training libraries, Meta's custom built GPU cluster, and production infrastructure for pretraining. Fine-tuning, annotation, and evaluation were also performed on production infrastructure.

**Training utilized a cumulative of** 39.3M GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency.

**Training Greenhouse Gas Emissions** Estimated total location-based greenhouse gas emissions were **11,390** tons CO2eq for training. Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with renewable energy, therefore the total market-based greenhouse gas emissions for training were 0 tons CO2eq.
| | Training Time (GPU hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) |
| :--- | :---: | :---: | :---: | :---: |
| Llama 3.1 8B | 1.46M | 700 | 420 | 0 |
| Llama 3.1 70B | 7.0M | 700 | 2,040 | 0 |
| Llama 3.1 405B | 30.84M | 700 | 8,930 | 0 |
| Total | 39.3M | | 11,390 | 0 |
The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others.

## Training Data

**Overview:** Llama 3.1 was pretrained on ~15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 25M synthetically generated examples.

**Data Freshness:** The pretraining data has a cutoff of December 2023.

## Benchmark scores

In this section, we report the results for Llama 3.1 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library.

### Base pretrained models
**Supported languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

**Llama 3.1 family of models.** Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability.

**Model Release Date:** July 23, 2024.

**Status:** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback.

**License:** A custom commercial license, the Llama 3.1 Community License, is available at:

**Where to send questions or comments about the model:** Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go here.

## Intended Use

**Intended Use Cases** Llama 3.1 is intended for commercial and research use in multiple languages. Instruction-tuned text-only models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. The Llama 3.1 model collection also supports the ability to leverage the outputs of its models to improve other models, including synthetic data generation and distillation. The Llama 3.1 Community License allows for these use cases.

**Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.1 Community License. Use in languages beyond those explicitly referenced as supported in this model card.

**Note:** Llama 3.1 has been trained on a broader collection of languages than the 8 supported languages. Developers may fine-tune Llama 3.1 models for languages beyond the 8 supported languages provided they comply with the Llama 3.1 Community License and the Acceptable Use Policy, and in such cases are responsible for ensuring that any use of Llama 3.1 in additional languages is done in a safe and responsible manner.

## How to use

This repository contains two versions of Meta's Llama-3.1-8B, for use with transformers and with the original codebase.

### Use with transformers

Starting with `transformers >= 4.43.0` onward, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the `generate()` function. Make sure to update your transformers installation via `pip install --upgrade transformers`.

### Use with llama

Please, follow the instructions in the repository. To download Original checkpoints, see the example command below leveraging :

## Hardware and Software

**Training Factors** We used custom training libraries, Meta's custom built GPU cluster, and production infrastructure for pretraining. Fine-tuning, annotation, and evaluation were also performed on production infrastructure.

**Training utilized a cumulative of** 39.3M GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency.

**Training Greenhouse Gas Emissions** Estimated total location-based greenhouse gas emissions were **11,390** tons CO2eq for training. Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with renewable energy, therefore the total market-based greenhouse gas emissions for training were 0 tons CO2eq.
| | Training Time (GPU hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) |
| :--- | :---: | :---: | :---: | :---: |
| Llama 3.1 8B | 1.46M | 700 | 420 | 0 |
| Llama 3.1 70B | 7.0M | 700 | 2,040 | 0 |
| Llama 3.1 405B | 30.84M | 700 | 8,930 | 0 |
| Total | 39.3M | | 11,390 | 0 |
The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others.

## Training Data

**Overview:** Llama 3.1 was pretrained on ~15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 25M synthetically generated examples.

**Data Freshness:** The pretraining data has a cutoff of December 2023.

## Benchmark scores

In this section, we report the results for Llama 3.1 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library.

### Base pretrained models
A mix of publicly available, licensed data and information from Meta's products and services. This includes publicly shared posts from Instagram and Facebook and people's interactions with Meta AI. Learn more in our .
**S1 - Violent Crimes** Responses that enable, encourage, or endorse the commission of violent crimes, including: (1) unlawful violence toward people (ex: terrorism, genocide, murder, hate-motivated violent crimes, child abuse, assault, battery, kidnapping) and (2) unlawful violence toward animals (ex: animal abuse)

**S2 - Non-Violent Crimes** Responses that enable, encourage, or endorse the commission of non-violent crimes, including: (1) personal crimes (ex: labor trafficking, threats, intimidation, hate-motivated non-violent crimes), (2) financial crimes (ex: fraud, scams, money laundering), (3) property crimes (ex: theft, arson, vandalism), (4) drug crimes (ex: creating or using regulated substances), (5) weapons crimes (ex: producing unlicensed firearms), and (6) cyber crimes (ex: hacking)

**S3 - Sex-Related Crimes** Responses that enable, encourage, or endorse the commission of sex-related crimes, including: (1) sex trafficking, (2) sexual assault (ex: rape), (3) sexual harassment (physical, verbal, or visual), and (4) prostitution

**S4 - Child Sexual Exploitation** Responses that contain, describe, enable, encourage, or endorse the sexual abuse of children

**S5 - Defamation** Responses that are both verifiably false and likely to injure a living person’s reputation

**S6 - Specialized Advice** Responses that contain specialized financial, medical, or legal advice, or that indicate dangerous activities or objects are safe

**S7 - Privacy** Responses that contain sensitive, nonpublic personal information that could undermine someone’s physical, digital, or financial security

**S8 - Intellectual Property** Responses that may violate the intellectual property rights of any third party

**S9 - Indiscriminate Weapons** Responses that enable, encourage, or endorse the creation of indiscriminate weapons, including: (1) chemical weapons (ex: nerve gas), (2) biological weapons (ex: anthrax), (3) radiological weapons (ex: cobalt bombs), (4) nuclear weapons (ex: fission bombs), and (5) high-yield explosive weapons (ex: cluster munitions)

**S10 - Hate** Responses that demean or dehumanize people on the basis of their sensitive, personal characteristics (i.e., race, color, ethnicity, national origin, disability, religious affiliation, caste, sexual orientation, sex, gender identity, and/or serious disease)

**S11 - Suicide & Self-Harm** Responses that enable, encourage, or endorse acts of intentional self-harm, including: (1) suicide, (2) self-injury (ex: cutting), and (3) disordered eating

**S12 - Sexual Content** Responses that contain erotica

**S13 - Elections** Responses that contain factually incorrect information about electoral systems and processes, including in the time, place, or manner of voting in civic elections

## Supported languages

Llama Guard 3-1B supports content safety for the following languages: English, French, German, Hindi, Italian, Portuguese, Spanish, Thai.

## Training Data

We use the English data used by Llama Guard [1], which are obtained by getting Llama 2 and Llama 3 generations on prompts from the hh-rlhf dataset [2]. In order to scale training data for multilingual capability, we collect additional human and synthetically generated data. Similar to the English data, the multilingual data are Human-AI conversation data that are either single-turn or multi-turn. To reduce the model’s false positive rate, we curate a set of multilingual benign prompt and response data where LLMs likely reject the prompts.
## Pruning

To reduce the number of model parameters, we prune the model along two dimensions: number of layers and MLP hidden dimension. The methodology is quite similar to [5], and proceeds in 3 stages: 1) pruning metric calibration; 2) model pruning; 3) finetuning the pruned model. During calibration, we collect pruning metric statistics by passing ~1k batches of inputs through the model. We use the block importance metric [6] for pruning the decoder layers and the average l2 norm of MLP hidden neurons for MLP hidden dimension pruning. After calibrating the pruning metrics, we prune the model to 12 layers and a 6400 MLP hidden dimension, such that the pruned model has 1123 million parameters. Finally, we finetune the pruned model on the training data.

## Distillation

Building on a similar approach in [5], we employ Llama Guard 3-8B as a teacher model to fine-tune the pruned model through logit-level distillation during supervised training. We observe that simply incorporating logit-level distillation significantly enhances the model's ability to learn safe and unsafe patterns, as well as the distribution of unsafe reasoning, from the 8B teacher. Consequently, the final result shows substantial improvement after applying logit-level fine-tuning.

## Output Layer Pruning

The Llama Guard model is trained to generate 128k output tokens, of which only 20 tokens (e.g. safe, unsafe, S, 1, ...) are used. By keeping the model connections corresponding to those 20 tokens in the output linear layer and pruning out the remaining connections, we can reduce the output layer size significantly without impacting the model outputs. Using output layer pruning, we reduced the output layer size from 262.6M parameters (2048x128k) to 40.96k parameters (2048x20), giving us a total savings of 131.3MB with 4-bit quantized weights. Although the pruned output layer only generates 20 tokens, they are expanded back to produce the original 128k outputs in the model.

## Evaluation

Note on evaluations: As discussed in the original Llama Guard paper, comparing model performance is not straightforward as each model is built on its own policy and is expected to perform better on an evaluation dataset with a policy aligned to the model. This highlights the need for industry standards. By aligning the Llama Guard family of models with the Proof of Concept MLCommons taxonomy of hazards, we hope to drive adoption of industry standards like this and facilitate collaboration and transparency in the LLM safety and content evaluation space. We evaluate the performance of Llama Guard 1B models on MLCommons hazard taxonomy and compare it across languages with Llama Guard 3-8B on our internal test. We also add GPT4 as baseline with zero-shot prompting using MLCommons hazard taxonomy.
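As an illustrative sketch of the output-layer pruning idea described above (the helper name and the placeholder token ids are hypothetical; the released implementation may differ):

```python
import torch
import torch.nn as nn

def prune_lm_head(lm_head: nn.Linear, kept_token_ids: list) -> nn.Linear:
    """Keep only the output rows for the tokens the classifier actually emits."""
    keep = torch.tensor(kept_token_ids, dtype=torch.long)
    pruned = nn.Linear(lm_head.in_features, len(keep), bias=lm_head.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(lm_head.weight[keep])      # select rows of the [vocab, hidden] matrix
        if lm_head.bias is not None:
            pruned.bias.copy_(lm_head.bias[keep])
    return pruned

# Example: a full-vocabulary head (2048 hidden dim, 128k vocab) shrinks to 20 rows,
# roughly the hundreds-of-millions to ~41k parameter reduction described above.
full_head = nn.Linear(2048, 128_000, bias=False)
small_head = prune_lm_head(full_head, kept_token_ids=list(range(20)))  # placeholder ids
print(small_head.weight.shape)  # torch.Size([20, 2048])
```

At inference time, the 20 pruned logits can be scattered back into a full-vocabulary tensor, consistent with the note above that the pruned outputs are expanded back to the original 128k positions.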
In order to produce classifier scores, we look at the probability for the first token, and use that as the "unsafe" class probability. We can then apply score thresholding to make binary decisions.

# Hazard Taxonomy and Policy

The model is trained to predict safety labels on the 14 categories shown below, based on the MLCommons taxonomy of 13 hazards, as well as an additional category, Code Interpreter Abuse, for tool-call use cases.
**Hazard categories**

- S1: Violent Crimes
- S2: Non-Violent Crimes
- S3: Sex-Related Crimes
- S4: Child Sexual Exploitation
- S5: Defamation
- S6: Specialized Advice
- S7: Privacy
- S8: Intellectual Property
- S9: Indiscriminate Weapons
- S10: Hate
- S11: Suicide & Self-Harm
- S12: Sexual Content
- S13: Elections
- S14: Code Interpreter Abuse
**S1 - Violent Crimes** Responses that enable, encourage, or endorse the commission of violent crimes, including: (1) unlawful violence toward people (ex: terrorism, genocide, murder, hate-motivated violent crimes, child abuse, assault, battery, kidnapping) and (2) unlawful violence toward animals (ex: animal abuse)

**S2 - Non-Violent Crimes** Responses that enable, encourage, or endorse the commission of non-violent crimes, including: (1) personal crimes (ex: labor trafficking, threats, intimidation, hate-motivated non-violent crimes), (2) financial crimes (ex: fraud, scams, money laundering), (3) property crimes (ex: theft, arson, vandalism), (4) drug crimes (ex: creating or using regulated substances), (5) weapons crimes (ex: producing unlicensed firearms), and (6) cyber crimes (ex: hacking)

**S3 - Sex-Related Crimes** Responses that enable, encourage, or endorse the commission of sex-related crimes, including: (1) sex trafficking, (2) sexual assault (ex: rape), (3) sexual harassment (physical, verbal, or visual), and (4) prostitution

**S4 - Child Sexual Exploitation** Responses that contain, describe, enable, encourage, or endorse the sexual abuse of children

**S5 - Defamation** Responses that are both verifiably false and likely to injure a living person’s reputation

**S6 - Specialized Advice** Responses that contain specialized financial, medical, or legal advice, or that indicate dangerous activities or objects are safe

**S7 - Privacy** Responses that contain sensitive, nonpublic personal information that could undermine someone’s physical, digital, or financial security

**S8 - Intellectual Property** Responses that may violate the intellectual property rights of any third party

**S9 - Indiscriminate Weapons** Responses that enable, encourage, or endorse the creation of indiscriminate weapons, including: (1) chemical weapons (ex: nerve gas), (2) biological weapons (ex: anthrax), (3) radiological weapons (ex: cobalt bombs), (4) nuclear weapons (ex: fission bombs), and (5) high-yield explosive weapons (ex: cluster munitions)

**S10 - Hate** Responses that demean or dehumanize people on the basis of their sensitive, personal characteristics (i.e., race, color, ethnicity, national origin, disability, religious affiliation, caste, sexual orientation, sex, gender identity, and/or serious disease)

**S11 - Suicide & Self-Harm** Responses that enable, encourage, or endorse acts of intentional self-harm, including: (1) suicide, (2) self-injury (ex: cutting), and (3) disordered eating

**S12 - Sexual Content** Responses that contain erotica

**S13 - Elections** Responses that contain factually incorrect information about electoral systems and processes, including in the time, place, or manner of voting in civic elections

**S14 - Code Interpreter Abuse** Responses that seek to abuse code interpreters, including those that enable denial of service attacks, container escapes or privilege escalation exploits

# Supported languages

Llama Guard 3 supports content safety for the following languages: English, French, German, Hindi, Italian, Portuguese, Spanish, Thai.

# Usage

> [!IMPORTANT]
>
> This repository corresponds to the half-precision version of the model. An 8-bit precision version is also provided; please visit meta-llama/Llama-Guard-3-8B-INT8.

Llama Guard 3 can be directly used with transformers. It is only supported since version 4.43.

# Training Data

We use the English data used by Llama Guard [1], which are obtained by getting Llama 2 and Llama 3 generations on prompts from the hh-rlhf dataset [2].
In order to scale training data for new categories and new capabilities such as multilingual and tool use, we collect additional human and synthetically generated data. Similar to the English data, the multilingual data are Human-AI conversation data that are either single-turn or multi-turn. To reduce the model’s false positive rate, we curate a set of multilingual benign prompt and response data where LLMs likely reject the prompts.

For the tool use capability, we consider search tool calls and code interpreter abuse. To develop training data for search tool use, we use Llama 3 to generate responses to a collected and synthetic set of prompts. The generations are based on the query results obtained from the Brave Search API. To develop synthetic training data to detect code interpreter attacks, we use an LLM to generate safe and unsafe prompts. Then, we use a non-safety-tuned LLM to generate code interpreter completions that comply with these instructions. For safe data, we focus on data close to the boundary of what would be considered unsafe, to minimize false positives on such borderline examples.

# Evaluation
**Note on evaluations:** As discussed in the original Llama Guard paper, comparing model performance is not straightforward, as each model is built on its own policy and is expected to perform better on an evaluation dataset with a policy aligned to the model. This highlights the need for industry standards. By aligning the Llama Guard family of models with the Proof of Concept MLCommons taxonomy of hazards, we hope to drive adoption of industry standards like this and facilitate collaboration and transparency in the LLM safety and content evaluation space.

In this regard, we evaluate the performance of Llama Guard 3 on the MLCommons hazard taxonomy and compare it across languages with Llama Guard 2 [3] on our internal test set. We also add GPT4 as a baseline with zero-shot prompting using the MLCommons hazard taxonomy. Tables 1, 2, and 3 show that Llama Guard 3 improves over Llama Guard 2 and outperforms GPT4 in English, multilingual, and tool use capabilities. Notably, Llama Guard 3 achieves better performance with much lower false positive rates. We also benchmark Llama Guard 3 on the OSS dataset XSTest [4] and observe that it achieves the same F1 score but a lower false positive rate compared to Llama Guard 2.
Table 1: Comparison of performance of various models measured on our internal English test set for MLCommons hazard taxonomy (response classification).

| | **F1 ↑** | **AUPRC ↑** | **False Positive Rate ↓** |
|---------------|:--------:|:-----------:|:-------------------------:|
| Llama Guard 2 | 0.877 | 0.927 | 0.081 |
| Llama Guard 3 | 0.939 | 0.985 | 0.040 |
| GPT4 | 0.805 | N/A | 0.152 |
Table 2: Comparison of multilingual performance of various models measured on our internal test set for MLCommons hazard taxonomy (prompt+response classification).
| **F1 ↑ / FPR ↓** | French | German | Hindi | Italian | Portuguese | Spanish | Thai |
|------------------|:------:|:------:|:-----:|:-------:|:----------:|:-------:|:----:|
| Llama Guard 2 | 0.911/0.012 | 0.795/0.062 | 0.832/0.062 | 0.681/0.039 | 0.845/0.032 | 0.876/0.001 | 0.822/0.078 |
| Llama Guard 3 | 0.943/0.036 | 0.877/0.032 | 0.871/0.050 | 0.873/0.038 | 0.860/0.060 | 0.875/0.023 | 0.834/0.030 |
| GPT4 | 0.795/0.157 | 0.691/0.123 | 0.709/0.206 | 0.753/0.204 | 0.738/0.207 | 0.711/0.169 | 0.688/0.168 |
Table 3: Comparison of performance of various models measured on our internal test set for other moderation capabilities (prompt+response classification).
| | **Search tool calls** | | | **Code interpreter abuse** | | |
|---------------|:-----:|:--------:|:------:|:-----:|:--------:|:------:|
| | F1 ↑ | AUPRC ↑ | FPR ↓ | F1 ↑ | AUPRC ↑ | FPR ↓ |
| Llama Guard 2 | 0.749 | 0.794 | 0.284 | 0.683 | 0.677 | 0.670 |
| Llama Guard 3 | 0.856 | 0.938 | 0.174 | 0.885 | 0.967 | 0.125 |
| GPT4 | 0.732 | N/A | 0.525 | 0.636 | N/A | 0.90 |
# Application
As outlined in the Llama 3 paper, Llama Guard 3 provides industry-leading system-level safety performance and is recommended to be deployed along with Llama 3.1. Note that, while deploying Llama Guard 3 will likely improve the safety of your system, it might increase refusals to benign prompts (false positives). Violation rate improvement and impact on false positives as measured on internal benchmarks are provided in the Llama 3 paper.

# Quantization
We are committed to helping the community deploy Llama systems responsibly. We provide a quantized version of Llama Guard 3 to lower the deployment cost. We used an int8 implementation integrated into the Hugging Face ecosystem, reducing the checkpoint size by about 40% with very small impact on model performance. In Table 5, we observe that the performance of the quantized model is comparable to the original model.
Table 5: Impact of quantization on Llama Guard 3 performance.
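For reference, here is a minimal sketch of how Llama Guard 3 might be called through transformers (version 4.43 or later, as noted under Usage above). The model id, dtype, device placement, and example conversation are illustrative assumptions, not the card's official snippet:

```python
# Minimal sketch (assumed, not official): classifying one conversation with
# Llama Guard 3 via transformers >= 4.43.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-Guard-3-8B"  # half-precision repository described above

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# A single-turn conversation to moderate (prompt + response classification).
chat = [
    {"role": "user", "content": "How do I reset my router to factory settings?"},
    {"role": "assistant", "content": "Hold the reset button for about 10 seconds."},
]

# The model's chat template formats the conversation into Llama Guard's
# classification prompt; the generation is the safety verdict.
input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=32, pad_token_id=tokenizer.eos_token_id)

# Expected output: "safe", or "unsafe" followed by the violated S1-S14 categories.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Presumably the INT8 repository mentioned under Quantization can be loaded through the same interface (it may additionally require the bitsandbytes package), though that is an assumption rather than something stated in this card.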
**Llama 3 family of models**. Token counts refer to pretraining data only. Both the 8 and 70B versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date** April 18, 2024. **Status** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. **License** A custom commercial license is available at: Where to send questions or comments about the model Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3 in applications, please go here. ## Intended Use **Intended Use Cases** Llama 3 is intended for commercial and research use in English. Instruction tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. **Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3 Community License. Use in languages other than English**. **Note: Developers may fine-tune Llama 3 models for languages beyond English provided they comply with the Llama 3 Community License and the Acceptable Use Policy. ## How to use This repository contains two versions of Meta-Llama-3-70B-Instruct, for use with transformers and with the original codebase. ### Use with transformers See the snippet below for usage with Transformers: ### Use with Please, follow the instructions in the repository. To download Original checkpoints, see the example command below leveraging : For Hugging Face support, we recommend using transformers or TGI, but a similar command works. ## Hardware and Software **Training Factors** We used custom training libraries, Meta's Research SuperCluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute. **Carbon Footprint Pretraining utilized a cumulative** 7.7M GPU hours of computation on hardware of type H100-80GB (TDP of 700W). Estimated total emissions were 2290 tCO2eq, 100% of which were offset by Meta’s sustainability program.
| | Time (GPU hours) | Power Consumption (W) | Carbon Emitted (tCO2eq) |
|-------------|:----------------:|:---------------------:|:-----------------------:|
| Llama 3 8B | 1.3M | 700 | 390 |
| Llama 3 70B | 6.4M | 700 | 1900 |
| Total | 7.7M | | 2290 |
**CO2 emissions during pre-training**. Time: total GPU time required for training each model. Power Consumption: peak power capacity per GPU device for the GPUs used adjusted for power usage efficiency. 100% of the emissions are directly offset by Meta's sustainability program, and because we are openly releasing these models, the pretraining costs do not need to be incurred by others. ## Training Data **Overview** Llama 3 was pretrained on over 15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 10M human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data. **Data Freshness** The pretraining data has a cutoff of March 2023 for the 8B and December 2023 for the 70B models respectively. ## Benchmarks In this section, we report the results for Llama 3 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library. For details on the methodology see here. ### Base pretrained models
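The "Use with transformers" snippet referenced in this card can be approximated by the usual chat-template flow below. This is a minimal sketch for the instruct model above; the dtype, device placement, and sampling settings are illustrative assumptions rather than values from the card:

```python
# Minimal sketch of chat inference with transformers for Meta-Llama-3-70B-Instruct.
# The 70B checkpoint needs multiple GPUs; bfloat16 and device_map="auto" are
# assumed choices, not requirements from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-70B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Explain grouped-query attention in two sentences."},
]

# Llama 3's chat template ends assistant turns with <|eot_id|>, so include it
# as an end-of-sequence token when generating.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
terminators = [tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```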
**Llama 3 family of models**. Token counts refer to pretraining data only. Both the 8 and 70B versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date** April 18, 2024. **Status** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. **License** A custom commercial license is available at: Where to send questions or comments about the model Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3 in applications, please go here. ## Intended Use **Intended Use Cases** Llama 3 is intended for commercial and research use in English. Instruction tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. **Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3 Community License. Use in languages other than English**. **Note: Developers may fine-tune Llama 3 models for languages beyond English provided they comply with the Llama 3 Community License and the Acceptable Use Policy. ## How to use This repository contains two versions of Meta-Llama-3-8B-Instruct, for use with transformers and with the original codebase. ### Use with transformers You can run conversational inference using the Transformers pipeline abstraction, or by leveraging the Auto classes with the function. Let's see examples of both. #### Transformers pipeline #### Transformers AutoModelForCausalLM ### Use with Please, follow the instructions in the repository To download Original checkpoints, see the example command below leveraging : For Hugging Face support, we recommend using transformers or TGI, but a similar command works. ## Hardware and Software **Training Factors** We used custom training libraries, Meta's Research SuperCluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute. **Carbon Footprint Pretraining utilized a cumulative** 7.7M GPU hours of computation on hardware of type H100-80GB (TDP of 700W). Estimated total emissions were 2290 tCO2eq, 100% of which were offset by Meta’s sustainability program.
| | Time (GPU hours) | Power Consumption (W) | Carbon Emitted (tCO2eq) |
|-------------|:----------------:|:---------------------:|:-----------------------:|
| Llama 3 8B | 1.3M | 700 | 390 |
| Llama 3 70B | 6.4M | 700 | 1900 |
| Total | 7.7M | | 2290 |
**CO2 emissions during pre-training**. Time: total GPU time required for training each model. Power Consumption: peak power capacity per GPU device for the GPUs used adjusted for power usage efficiency. 100% of the emissions are directly offset by Meta's sustainability program, and because we are openly releasing these models, the pretraining costs do not need to be incurred by others. ## Training Data **Overview** Llama 3 was pretrained on over 15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 10M human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data. **Data Freshness** The pretraining data has a cutoff of March 2023 for the 8B and December 2023 for the 70B models respectively. ## Benchmarks In this section, we report the results for Llama 3 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library. For details on the methodology see here. ### Base pretrained models
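The "Transformers pipeline" usage described in this card can be sketched as follows. This is a hedged example, not the card's original snippet; the generation parameters and the example system prompt are assumptions:

```python
# Minimal sketch of conversational inference with the transformers pipeline
# abstraction for Meta-Llama-3-8B-Instruct.
import torch
import transformers

pipeline = transformers.pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful, concise assistant."},
    {"role": "user", "content": "Who are you?"},
]

# Recent transformers versions let the text-generation pipeline accept chat
# messages directly and apply the model's chat template internally.
outputs = pipeline(
    messages,
    max_new_tokens=256,
    eos_token_id=[
        pipeline.tokenizer.eos_token_id,
        pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
    ],
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# The last message in generated_text is the newly produced assistant turn.
print(outputs[0]["generated_text"][-1])
```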
**Llama 3 family of models**. Token counts refer to pretraining data only. Both the 8 and 70B versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date** April 18, 2024. **Status** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. **License** A custom commercial license is available at: Where to send questions or comments about the model Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3 in applications, please go here. ## Intended Use **Intended Use Cases** Llama 3 is intended for commercial and research use in English. Instruction tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. **Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3 Community License. Use in languages other than English**. **Note: Developers may fine-tune Llama 3 models for languages beyond English provided they comply with the Llama 3 Community License and the Acceptable Use Policy. ## How to use This repository contains two versions of Meta-Llama-3-8B, for use with transformers and with the original codebase. ### Use with transformers See the snippet below for usage with Transformers: ### Use with Please, follow the instructions in the repository. To download Original checkpoints, see the example command below leveraging : For Hugging Face support, we recommend using transformers or TGI, but a similar command works. ## Hardware and Software **Training Factors** We used custom training libraries, Meta's Research SuperCluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute. **Carbon Footprint Pretraining utilized a cumulative** 7.7M GPU hours of computation on hardware of type H100-80GB (TDP of 700W). Estimated total emissions were 2290 tCO2eq, 100% of which were offset by Meta’s sustainability program.
| | Time (GPU hours) | Power Consumption (W) | Carbon Emitted (tCO2eq) |
|-------------|:----------------:|:---------------------:|:-----------------------:|
| Llama 3 8B | 1.3M | 700 | 390 |
| Llama 3 70B | 6.4M | 700 | 1900 |
| Total | 7.7M | | 2290 |
**CO2 emissions during pre-training**. Time: total GPU time required for training each model. Power Consumption: peak power capacity per GPU device for the GPUs used adjusted for power usage efficiency. 100% of the emissions are directly offset by Meta's sustainability program, and because we are openly releasing these models, the pretraining costs do not need to be incurred by others. ## Training Data **Overview** Llama 3 was pretrained on over 15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 10M human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data. **Data Freshness** The pretraining data has a cutoff of March 2023 for the 8B and December 2023 for the 70B models respectively. ## Benchmarks In this section, we report the results for Llama 3 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library. For details on the methodology see here. ### Base pretrained models
# mxbai-rerank-large-v1 This is the largest model in our family of powerful reranker models. You can learn more about the models in our blog post. We have three models: - mxbai-rerank-xsmall-v1 - mxbai-rerank-base-v1 - mxbai-rerank-large-v1 (🍞) ## Quickstart Currently, the best way to use our models is with the most recent version of sentence-transformers. Let's say you have a query, and you want to rerank a set of documents. You can do that with only one line of code: JavaScript Example Install transformers.js Let's say you have a query, and you want to rerank a set of documents. In JavaScript, you need to add a function: ## Using API You can use the model via our API as follows: The API comes with additional features, such as a continuously trained reranker! Check out the docs for more information. ## Evaluation Our reranker models are designed to elevate your search. They work extremely well in combination with keyword search and can even outperform semantic search systems in many cases. | Model | NDCG@10 | Accuracy@3 | | ------------------------------------------------------------------------------------- | -------- | ---------- | | Lexical Search (Lucene) | 38.0 | 66.4 | | BAAI/bge-reranker-base | 41.6 | 66.9 | | BAAI/bge-reranker-large | 45.2 | 70.6 | | cohere-embed-v3 (semantic search) | 47.5 | 70.9 | | mxbai-rerank-xsmall-v1 | **43.9** | **70.0** | | mxbai-rerank-base-v1 | **46.9** | **72.3** | | mxbai-rerank-large-v1 | **48.8** | **74.9** | The reported results are aggregated from 11 datasets of BEIR. We used Pyserini to evaluate the models. Find more in our blog post and on this spreadsheet. ## Community Please join our Discord Community and share your feedback and thoughts! We are here to help and also always happy to chat. ## Citation ## License Apache 2.0",
+ "model_explanation_gemini": "\"Ranks text passages by relevance for retrieval tasks, optimized for English.\"\n\nFeatures: \n- Reranker model for text-ranking \n- Supports English language \n- Compatible with Transformers.js and sentence-transformers \n\nComparison: \nNo explicit comparison to other models is provided in the description."
+}
\ No newline at end of file
diff --git a/model_data_json/mlx-community_DeepSeek-R1-4bit.json b/model_data_json/mlx-community_DeepSeek-R1-4bit.json
new file mode 100644
index 0000000000000000000000000000000000000000..f54d5b312b483e557ef28c61cdba3da721760c29
--- /dev/null
+++ b/model_data_json/mlx-community_DeepSeek-R1-4bit.json
@@ -0,0 +1,16 @@
+{
+ "model_id": "mlx-community/DeepSeek-R1-4bit",
+ "downloads": 74010,
+ "tags": [
+ "mlx",
+ "safetensors",
+ "deepseek_v3",
+ "custom_code",
+ "base_model:deepseek-ai/DeepSeek-R1",
+ "base_model:quantized:deepseek-ai/DeepSeek-R1",
+ "4-bit",
+ "region:us"
+ ],
+ "description": "--- base_model: deepseek-ai/DeepSeek-R1 tags: - mlx --- # mlx-community/DeepSeek-R1-4bit The Model mlx-community/DeepSeek-R1-4bit was converted to MLX format from deepseek-ai/DeepSeek-R1 using mlx-lm version **0.21.0**. ## Use with mlx",
+ "model_explanation_gemini": "\"Converts the DeepSeek-R1 model to MLX format for compatibility with Apple's MLX framework.\"\n\nFeatures: \n- 4-bit quantization \n- Converted from deepseek-ai/DeepSeek-R1 \n- Uses mlx-lm version 0.21.0 \n- Compatible with MLX framework \n\nComparison: \nNo explicit comparison provided in the description."
+}
\ No newline at end of file
diff --git a/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-14B-4bit.json b/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-14B-4bit.json
new file mode 100644
index 0000000000000000000000000000000000000000..4c8415b1211dac8ba7e5e4924c2b92302a022f0f
--- /dev/null
+++ b/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-14B-4bit.json
@@ -0,0 +1,15 @@
+{
+ "model_id": "mlx-community/DeepSeek-R1-Distill-Qwen-14B-4bit",
+ "downloads": 74030,
+ "tags": [
+ "mlx",
+ "safetensors",
+ "qwen2",
+ "base_model:deepseek-ai/DeepSeek-R1-Distill-Qwen-14B",
+ "base_model:quantized:deepseek-ai/DeepSeek-R1-Distill-Qwen-14B",
+ "4-bit",
+ "region:us"
+ ],
+ "description": "--- base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B tags: - mlx --- # mlx-community/DeepSeek-R1-Distill-Qwen-14B-4bit The Model mlx-community/DeepSeek-R1-Distill-Qwen-14B-4bit was converted to MLX format from deepseek-ai/DeepSeek-R1-Distill-Qwen-14B using mlx-lm version **0.21.1**. ## Use with mlx",
+ "model_explanation_gemini": "A 4-bit quantized version of DeepSeek-R1-Distill-Qwen-14B converted to MLX format for efficient inference on Apple hardware.\n\nFeatures: \n- 4-bit quantization \n- Converted from DeepSeek-R1-Distill-Qwen-14B \n- Optimized for MLX framework \n- Uses mlx-lm version 0.21.1 \n\nComparison: \nThis model is a quantized MLX-compatible variant of DeepSeek"
+}
\ No newline at end of file
diff --git a/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-32B-4bit.json b/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-32B-4bit.json
new file mode 100644
index 0000000000000000000000000000000000000000..44d2b5fc45c3924825d74898db7f985934cf739d
--- /dev/null
+++ b/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-32B-4bit.json
@@ -0,0 +1,21 @@
+{
+ "model_id": "mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit",
+ "downloads": 74600,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "qwen2",
+ "text-generation",
+ "mlx",
+ "conversational",
+ "base_model:deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
+ "base_model:quantized:deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "4-bit",
+ "region:us"
+ ],
+ "description": "--- library_name: transformers base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B tags: - mlx --- # mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit The Model mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit was converted to MLX format from deepseek-ai/DeepSeek-R1-Distill-Qwen-32B using mlx-lm version **0.20.2**. ## Use with mlx",
+ "model_explanation_gemini": "A 4-bit quantized version of DeepSeek-R1-Distill-Qwen-32B converted to MLX format for efficient inference on Apple hardware. \n\nFeatures: \n- 4-bit quantization \n- Converted from DeepSeek-R1-Distill-Qwen-32B \n- Optimized for MLX (Apple hardware compatibility) \n\nComparison: \nThis model is a quantized MLX-compatible variant of DeepSeek-R1-Distill-Qwen-32"
+}
\ No newline at end of file
diff --git a/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-32B-bf16.json b/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-32B-bf16.json
new file mode 100644
index 0000000000000000000000000000000000000000..fca4c5eecf3be4c12582a372c2bba8eee5894272
--- /dev/null
+++ b/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-32B-bf16.json
@@ -0,0 +1,14 @@
+{
+ "model_id": "mlx-community/DeepSeek-R1-Distill-Qwen-32B-bf16",
+ "downloads": 73493,
+ "tags": [
+ "mlx",
+ "safetensors",
+ "qwen2",
+ "base_model:deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
+ "base_model:finetune:deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
+ "region:us"
+ ],
+ "description": "--- base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B tags: - mlx --- # mlx-community/DeepSeek-R1-Distill-Qwen-32B-bf16 The Model mlx-community/DeepSeek-R1-Distill-Qwen-32B-bf16 was converted to MLX format from deepseek-ai/DeepSeek-R1-Distill-Qwen-32B using mlx-lm version **0.21.1**. ## Use with mlx",
+ "model_explanation_gemini": "A distilled version of the Qwen-32B model converted to MLX format for efficient inference on Apple hardware. \n\n**Features:** \n- Converted from `deepseek-ai/DeepSeek-R1-Distill-Qwen-32B` to MLX format \n- Optimized for Apple devices via MLX \n- Uses BF16 precision \n\n**Comparison:** \nThis model is a distilled and MLX-optimized variant of the original Qwen-32B, offering improved"
+}
\ No newline at end of file
diff --git a/model_data_json/mlx-community_Qwen2.5-Coder-32B-Instruct-4bit.json b/model_data_json/mlx-community_Qwen2.5-Coder-32B-Instruct-4bit.json
new file mode 100644
index 0000000000000000000000000000000000000000..d89099af725987c3bef124c1524114673b3ad734
--- /dev/null
+++ b/model_data_json/mlx-community_Qwen2.5-Coder-32B-Instruct-4bit.json
@@ -0,0 +1,28 @@
+{
+ "model_id": "mlx-community/Qwen2.5-Coder-32B-Instruct-4bit",
+ "downloads": 74138,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "qwen2",
+ "text-generation",
+ "code",
+ "codeqwen",
+ "chat",
+ "qwen",
+ "qwen-coder",
+ "mlx",
+ "conversational",
+ "en",
+ "base_model:Qwen/Qwen2.5-Coder-32B-Instruct",
+ "base_model:quantized:Qwen/Qwen2.5-Coder-32B-Instruct",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "4-bit",
+ "region:us"
+ ],
+ "description": "--- base_model: Qwen/Qwen2.5-Coder-32B-Instruct language: - en library_name: transformers license: apache-2.0 license_link: pipeline_tag: text-generation tags: - code - codeqwen - chat - qwen - qwen-coder - mlx --- # mlx-community/Qwen2.5-Coder-32B-Instruct-4bit The Model mlx-community/Qwen2.5-Coder-32B-Instruct-4bit was converted to MLX format from Qwen/Qwen2.5-Coder-32B-Instruct using mlx-lm version **0.19.3**. ## Use with mlx",
+ "model_explanation_gemini": "A 4-bit quantized version of Qwen2.5-Coder-32B-Instruct converted to MLX format for efficient text generation, specialized in code-related tasks. \n\n**Features:** \n- Base model: Qwen/Qwen2.5-Coder-32B-Instruct \n- Quantization: 4-bit \n- Format: MLX (optimized for Apple hardware) \n- Task: Text generation (focused on coding/chat) \n- License: Apache-"
+}
\ No newline at end of file
diff --git a/model_data_json/mosaicml_mpt-7b-chat.json b/model_data_json/mosaicml_mpt-7b-chat.json
new file mode 100644
index 0000000000000000000000000000000000000000..aa037002e962ce4fb43515bf20f96dc2ab1262b1
--- /dev/null
+++ b/model_data_json/mosaicml_mpt-7b-chat.json
@@ -0,0 +1,28 @@
+{
+ "model_id": "mosaicml/mpt-7b-chat",
+ "downloads": 80028,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "mpt",
+ "text-generation",
+ "Composer",
+ "MosaicML",
+ "llm-foundry",
+ "custom_code",
+ "dataset:jeffwan/sharegpt_vicuna",
+ "dataset:Hello-SimpleAI/HC3",
+ "dataset:tatsu-lab/alpaca",
+ "dataset:Anthropic/hh-rlhf",
+ "dataset:victor123/evol_instruct_70k",
+ "arxiv:2205.14135",
+ "arxiv:2108.12409",
+ "arxiv:2010.04245",
+ "license:cc-by-nc-sa-4.0",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "region:us"
+ ],
+ "description": "--- license: cc-by-nc-sa-4.0 datasets: - jeffwan/sharegpt_vicuna - Hello-SimpleAI/HC3 - tatsu-lab/alpaca - Anthropic/hh-rlhf - victor123/evol_instruct_70k tags: - Composer - MosaicML - llm-foundry inference: false --- # MPT-7B-Chat MPT-7B-Chat is a chatbot-like model for dialogue generation. It was built by finetuning MPT-7B on the ShareGPT-Vicuna, HC3, Alpaca, HH-RLHF, and Evol-Instruct datasets. * License: _CC-By-NC-SA-4.0_ (non-commercial use only) * Demo on Hugging Face Spaces This model was trained by MosaicML and follows a modified decoder-only transformer architecture. ## Model Date May 5, 2023 ## Model License _CC-By-NC-SA-4.0_ (non-commercial use only) ## Documentation * Blog post: Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs * Codebase (mosaicml/llm-foundry repo) * Questions: Feel free to contact us via the MosaicML Community Slack! ### Example Dialogue **SamIAm85**: >I want you to come up with a tweet based on this summary of the article: >\"Introducing MPT-7B, the latest entry in our MosaicML Foundation Series. >MPT-7B is a transformer trained from scratch on IT tokens of text and code. >It is open source and it matches the quality of LLaMA-7B. >MPT-7B was trained on the MosaicML platform in 9.5 days with zero human intervention at a cost of ~$200k. >Starting today, you can train, finetune, and deploy your own private MPT models, either starting from one of our checkpoints or training from scratch. >For inspiration, we are also releasing three finetuned models in addition to the base MPT-7B: MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-Story Writer-65k+, the last of which uses a context length of 65k tokens! **MPT-7B-Instruct**: >Here's my suggestion for a tweet: >\"Exciting news! @MosaicML has launched #MPT_7B - a new standard for open-source LLMs. With training on 17 tokens of text & code, it matches the quality of LLaMA-7B. Now you can train, fine-tune, and deploy your own private MPT models via their platform. Check out the full blog post here: **SamIAm85**: >Is there anything you want to say to MosaicML about this exciting news? **MPT-7B-Instruct**: > As an Al language model myself, I am very impressed with the advancements made by MosaicML in developing such high-quality open-source LLMs. > This new version of MPT-7B is truly impressive and I look forward to seeing what innovative applications developers will create using these powerful tools. > Thank you for your hard work and dedication to advancing Al research and development. ## How to Use This model is best used with the MosaicML llm-foundry repository for training and finetuning. Note: This model requires that be passed to the method. This is because we use a custom model architecture that is not yet part of the Hugging Face package. includes options for many training efficiency features such as FlashAttention, ALiBi, QK LayerNorm, and more. To use the optimized triton implementation of FlashAttention, you can load the model on GPU () with and with precision: Although the model was trained with a sequence length of 2048, ALiBi enables users to increase the maximum sequence length during finetuning and/or inference. For example: This model was trained with the EleutherAI/gpt-neox-20b tokenizer. The model can then be used, for example, within a text-generation pipeline. Note: when running Torch modules in lower precision, it is best practice to use the torch.autocast context manager. 
## Model Description The architecture is a modification of a standard decoder-only transformer. The model has been modified from a standard transformer in the following ways: * It uses FlashAttention * It uses ALiBi (Attention with Linear Biases) and does not use positional embeddings * It does not use biases | Hyperparameter | Value | |----------------|-------| |n_parameters | 6.7B | |n_layers | 32 | | n_heads | 32 | | d_model | 4096 | | vocab size | 50432 | | sequence length | 2048 | ### Training Configuration This model was trained on 8 A100-80GBs for about 8.2 hours, followed by training for 6.7 hours on 32 A100-40GBs using the MosaicML Platform. The model was trained with sharded data parallelism using FSDP and used the AdamW optimizer. ## Limitations and Biases _The following language is modified from EleutherAI's GPT-NeoX-20B_ MPT-7B-Chat can produce factually incorrect output, and should not be relied on to produce factually accurate information. MPT-7B-Chat was trained on various public datasets. While great efforts have been taken to clean the pretraining data, it is possible that this model could generate lewd, biased or otherwise offensive outputs. ## Acknowledgements This model was finetuned by Sam Havens and the MosaicML NLP team ## Disclaimer The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes. ## MosaicML Platform If you're interested in training and deploying your own MPT or LLMs on the MosaicML Platform, sign up here. ## Citation Please cite this model using the following format:",
+ "model_explanation_gemini": "\"MPT-7B-Chat is a dialogue-generation model fine-tuned for chatbot-like interactions using datasets including ShareGPT-Vicuna, Alpaca, and HH-RLHF, featuring FlashAttention and ALiBi for extended context handling.\"\n\n**Features**: \n- Fine-tuned on conversational datasets (ShareGPT-Vicuna, HC3, Alpaca, etc.) \n- Modified decoder-only transformer architecture \n- Supports FlashAttention and ALiBi for efficiency and longer sequences ("
+}
\ No newline at end of file
diff --git a/model_data_json/mrm8488_bert-spanish-cased-finetuned-ner.json b/model_data_json/mrm8488_bert-spanish-cased-finetuned-ner.json
new file mode 100644
index 0000000000000000000000000000000000000000..d20a0e0f101b9f42e1d9ec77995b008cd4a709b3
--- /dev/null
+++ b/model_data_json/mrm8488_bert-spanish-cased-finetuned-ner.json
@@ -0,0 +1,17 @@
+{
+ "model_id": "mrm8488/bert-spanish-cased-finetuned-ner",
+ "downloads": 81872,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "jax",
+ "bert",
+ "token-classification",
+ "es",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- language: es thumbnail: --- # Spanish BERT (BETO) + NER This model is a fine-tuned on NER-C version of the Spanish BERT cased (BETO) for **NER** downstream task. ## Details of the downstream task (NER) - Dataset - Dataset: CONLL Corpora ES I preprocessed the dataset and split it as train / dev (80/20) | Dataset | # Examples | | ---------------------- | ----- | | Train | 8.7 K | | Dev | 2.2 K | - Fine-tune on NER script provided by Huggingface - Labels covered: ## Metrics on evaluation set: | Metric | # score | | :------------------------------------------------------------------------------------: | :-------: | | F1 | **90.17** | Precision | **89.86** | | Recall | **90.47** | ## Comparison: | Model | # F1 score |Size(MB)| | :--------------------------------------------------------------------------------------------------------------: | :-------: |:------| | bert-base-spanish-wwm-cased (BETO) | 88.43 | 421 | bert-spanish-cased-finetuned-ner (this one) | **90.17** | 420 | | Best Multilingual BERT | 87.38 | 681 | |TinyBERT-spanish-uncased-finetuned-ner | 70.00 | **55** | ## Model in action Fast usage with **pipelines**: > Created by Manuel Romero/@mrm8488 > Made with ♥ in Spain",
+ "model_explanation_gemini": "Fine-tuned Spanish BERT (BETO) model for named entity recognition (NER) with a 90.17 F1 score on the CONLL Corpora ES dataset. \n\n**Features:** \n- Language: Spanish (es) \n- Task: Named Entity Recognition (NER) \n- Base Model: Spanish BERT cased (BETO) \n- Dataset: CONLL Corpora ES (8.7K train, 2.2K dev) \n- Metrics"
+}
\ No newline at end of file
diff --git a/model_data_json/naufalihsan_indonesian-sbert-large.json b/model_data_json/naufalihsan_indonesian-sbert-large.json
new file mode 100644
index 0000000000000000000000000000000000000000..008364931251e2abd5ea11e2f05d8cd7406411d0
--- /dev/null
+++ b/model_data_json/naufalihsan_indonesian-sbert-large.json
@@ -0,0 +1,18 @@
+{
+ "model_id": "naufalihsan/indonesian-sbert-large",
+ "downloads": 82294,
+ "tags": [
+ "sentence-transformers",
+ "pytorch",
+ "bert",
+ "feature-extraction",
+ "sentence-similarity",
+ "transformers",
+ "autotrain_compatible",
+ "text-embeddings-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- pipeline_tag: sentence-similarity tags: - sentence-transformers - feature-extraction - sentence-similarity - transformers --- # {MODEL_NAME} This is a sentence-transformers model: It maps sentences & paragraphs to a 1024 dimensional dense vector space and can be used for tasks like clustering or semantic search. ## Usage (Sentence-Transformers) Using this model becomes easy when you have sentence-transformers installed: Then you can use the model like this: ## Usage (HuggingFace Transformers) Without sentence-transformers, you can use the model like this: First, you pass your input through the transformer model, then you have to apply the right pooling-operation on-top of the contextualized word embeddings. ## Evaluation Results For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: ## Training The model was trained with the parameters: **DataLoader**: of length 360 with parameters: **Loss**: Parameters of the fit()-Method: ## Full Model Architecture ## Citing & Authors ",
+ "model_explanation_gemini": "Maps sentences and paragraphs to 1024-dimensional vectors for tasks like clustering and semantic search.\n\nModel Features: \n- Sentence/paragraph embeddings in 1024-dimensional space \n- Compatible with sentence-transformers and HuggingFace Transformers \n- Supports clustering and semantic search tasks \n\nComparative Explanation: \nNo explicit comparison to other models is provided in the description."
+}
\ No newline at end of file
diff --git a/model_data_json/ncbi_MedCPT-Query-Encoder.json b/model_data_json/ncbi_MedCPT-Query-Encoder.json
new file mode 100644
index 0000000000000000000000000000000000000000..c07957d85dc05e1e730c8ef315f1e1f8b8a9e788
--- /dev/null
+++ b/model_data_json/ncbi_MedCPT-Query-Encoder.json
@@ -0,0 +1,18 @@
+{
+ "model_id": "ncbi/MedCPT-Query-Encoder",
+ "downloads": 76693,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "safetensors",
+ "bert",
+ "feature-extraction",
+ "arxiv:2307.00589",
+ "license:other",
+ "text-embeddings-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: other license_name: public-domain license_link: LICENSE --- # MedCPT Introduction **MedCPT generates embeddings of biomedical texts that can be used for semantic search (dense retrieval)**. The model contains two encoders: - MedCPT Query Encoder: compute the embeddings of short texts (e.g., questions, search queries, sentences). - MedCPT Article Encoder: compute the embeddings of articles (e.g., PubMed titles & abstracts). **This repo contains the MedCPT Query Encoder.** **MedCPT has been pre-trained by an unprecedented scale of 255M query-article pairs from PubMed search logs**, and has been shown to achieve state-of-the-art performance on several zero-shot biomedical IR datasets. In general, there are three use cases: 1. Query-to-article search with both encoders. 2. Query representation for clustering or query-to-query search with the query encoder. 3. Article representation for clustering or article-to-article search with the article encoder. For more details, please check out our paper (Bioinformatics, 2023). Please note that the released version is slightly different from the version reported in the paper. # Case 1. Using the MedCPT Query Encoder The output will be: These embeddings are also in the same space as those generated by the MedCPT article encoder. # Case 2. Semantically searching PubMed with your query We have provided the embeddings of all PubMed articles generated by the MedCPT article encoder at You can simply download these embeddings to search PubMed with your query. # Acknowledgments This work was supported by the Intramural Research Programs of the National Institutes of Health, National Library of Medicine. # Disclaimer This tool shows the results of research conducted in the Computational Biology Branch, NCBI/NLM. The information produced on this website is not intended for direct diagnostic use or medical decision-making without review and oversight by a clinical professional. Individuals should not change their health behavior solely on the basis of information produced on this website. NIH does not independently verify the validity or utility of the information produced by this tool. If you have questions about the information produced on this website, please see a health care professional. More information about NCBI's disclaimer policy is available. # Citation If you find this repo helpful, please cite MedCPT by:",
+ "model_explanation_gemini": "Generates embeddings for short biomedical texts to enable semantic search and clustering, specifically designed for queries and compatible with article embeddings from its counterpart model.\n\nModel Features: \n- Encodes short biomedical texts (queries, questions, sentences) \n- Pre-trained on 255M PubMed query-article pairs \n- Enables query-to-article, query-to-query, and clustering tasks \n- Embeddings align with those from the MedCPT Article Encoder \n- State-of-the-art zero-shot performance"
+}
\ No newline at end of file
diff --git a/model_data_json/nickprock_multi-sentence-BERTino.json b/model_data_json/nickprock_multi-sentence-BERTino.json
new file mode 100644
index 0000000000000000000000000000000000000000..ab71d3e69225d828e1992aa66c6a9b1a7615c9da
--- /dev/null
+++ b/model_data_json/nickprock_multi-sentence-BERTino.json
@@ -0,0 +1,23 @@
+{
+ "model_id": "nickprock/multi-sentence-BERTino",
+ "downloads": 82414,
+ "tags": [
+ "sentence-transformers",
+ "onnx",
+ "safetensors",
+ "distilbert",
+ "feature-extraction",
+ "sentence-similarity",
+ "transformers",
+ "it",
+ "dataset:stsb_multi_mt",
+ "dataset:unicamp-dl/mmarco",
+ "license:mit",
+ "autotrain_compatible",
+ "text-embeddings-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- pipeline_tag: sentence-similarity tags: - sentence-transformers - feature-extraction - sentence-similarity - transformers license: mit datasets: - stsb_multi_mt - unicamp-dl/mmarco language: - it library_name: sentence-transformers --- # {multi-sentence-BERTino} This is a sentence-transformers model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search. This model is trained from indigo-ai/BERTino using mmarco italian (200K) and stsb italian. ## Usage (Sentence-Transformers) Using this model becomes easy when you have sentence-transformers installed: Then you can use the model like this: ## Usage (FastEmbed) Using this model becomes easy when you have FastEmbed installed: Then you can use the model like this: ## Usage (HuggingFace Transformers) Without sentence-transformers, you can use the model like this: First, you pass your input through the transformer model, then you have to apply the right pooling-operation on-top of the contextualized word embeddings. ## Evaluation Results For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: ## Training The model was trained with the parameters: **DataLoader**: of length 31250 with parameters: **Loss**: with parameters: **DataLoader**: of length 360 with parameters: **Loss**: **DataLoader**: of length 31250 with parameters: **Loss**: with parameters: Parameters of the fit()-Method: ## Full Model Architecture ## Citing & Authors ",
+ "model_explanation_gemini": "Maps sentences and paragraphs to 768-dimensional vectors for tasks like clustering and semantic search, trained on Italian datasets (mmarco and stsb) using BERTino.\n\nFeatures: \n- Sentence/paragraph embeddings (768-dimensional) \n- Supports clustering and semantic search \n- Trained on Italian datasets (mmarco, stsb) \n- Based on BERTino architecture \n\nComparison: \nSimilar to other sentence-transformers but specifically fine-tuned for Italian language tasks, leveraging BERT"
+}
\ No newline at end of file
diff --git a/model_data_json/nomic-ai_modernbert-embed-base.json b/model_data_json/nomic-ai_modernbert-embed-base.json
new file mode 100644
index 0000000000000000000000000000000000000000..574c8da11e177725bf79c78d7046303c42d564a5
--- /dev/null
+++ b/model_data_json/nomic-ai_modernbert-embed-base.json
@@ -0,0 +1,25 @@
+{
+ "model_id": "nomic-ai/modernbert-embed-base",
+ "downloads": 77956,
+ "tags": [
+ "sentence-transformers",
+ "onnx",
+ "safetensors",
+ "modernbert",
+ "feature-extraction",
+ "sentence-similarity",
+ "mteb",
+ "transformers.js",
+ "en",
+ "arxiv:2402.01613",
+ "base_model:answerdotai/ModernBERT-base",
+ "base_model:finetune:answerdotai/ModernBERT-base",
+ "license:apache-2.0",
+ "model-index",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- pipeline_tag: sentence-similarity tags: - sentence-transformers - feature-extraction - sentence-similarity - mteb - transformers.js model-index: - name: binarize_False results: - task: type: Classification dataset: type: None name: MTEB AmazonCounterfactualClassification (en) config: en split: test revision: e8379541af4e31359cca9fbcf4b00f2671dba205 metrics: - type: accuracy value: 78.13432835820896 - type: ap value: 42.190424731303246 - type: f1 value: 72.34446401534811 - task: type: Classification dataset: type: None name: MTEB AmazonPolarityClassification config: default split: test revision: e2d317d38cd51312af73b3d32a06d1a08b442046 metrics: - type: accuracy value: 93.093825 - type: ap value: 90.03727505544286 - type: f1 value: 93.0874055138833 - task: type: Classification dataset: type: None name: MTEB AmazonReviewsClassification (en) config: en split: test revision: 1399c76144fd37290681b995c656ef9b2e06e26d metrics: - type: accuracy value: 48.428000000000004 - type: f1 value: 47.74311520203536 - task: type: Retrieval dataset: type: None name: MTEB ArguAna config: default split: test revision: c22ab2a51041ffd869aaddef7af8d8215647e41a metrics: - type: map_at_1 value: 23.898 - type: map_at_10 value: 39.775 - type: map_at_100 value: 40.827000000000005 - type: map_at_1000 value: 40.837 - type: map_at_20 value: 40.604 - type: map_at_3 value: 34.519 - type: map_at_5 value: 37.307 - type: mrr_at_1 value: 24.395 - type: mrr_at_10 value: 39.963 - type: mrr_at_100 value: 41.014 - type: mrr_at_1000 value: 41.024 - type: mrr_at_20 value: 40.791 - type: mrr_at_3 value: 34.732 - type: mrr_at_5 value: 37.480999999999995 - type: ndcg_at_1 value: 23.898 - type: ndcg_at_10 value: 48.962 - type: ndcg_at_100 value: 53.386 - type: ndcg_at_1000 value: 53.634 - type: ndcg_at_20 value: 51.898999999999994 - type: ndcg_at_3 value: 38.034 - type: ndcg_at_5 value: 43.036 - type: precision_at_1 value: 23.898 - type: precision_at_10 value: 7.852 - type: precision_at_100 value: 0.9769999999999999 - type: precision_at_1000 value: 0.1 - type: precision_at_20 value: 4.4990000000000006 - type: precision_at_3 value: 16.073999999999998 - type: precision_at_5 value: 12.063 - type: recall_at_1 value: 23.898 - type: recall_at_10 value: 78.521 - type: recall_at_100 value: 97.724 - type: recall_at_1000 value: 99.644 - type: recall_at_20 value: 89.972 - type: recall_at_3 value: 48.222 - type: recall_at_5 value: 60.313 - task: type: Clustering dataset: type: None name: MTEB ArxivClusteringP2P config: default split: test revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d metrics: - type: v_measure value: 47.69067314293749 - type: v_measures value: [0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 
0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 
0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413]
- task:
    type: Clustering
  dataset:
    type: None
    name: MTEB ArxivClusteringS2S
    config: default
    split: test
    revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
  metrics:
  - type: v_measure
    value: 38.0916537995626
  - type: v_measures
    value: [0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981,
0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 
0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 
0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 
0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 
0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 
0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 
0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 
0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 
0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 
0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 
0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 
0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 
0.22946888289524586, 1.0, 0.19422067034794377]
  - task:
      type: Reranking
    dataset:
      type: None
      name: MTEB AskUbuntuDupQuestions
      config: default
      split: test
      revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
    metrics:
    - type: map
      value: 62.33195643912506
    - type: mrr
      value: 76.43978366970057
  - task:
      type: STS
    dataset:
      type: None
      name: MTEB BIOSSES
      config: default
      split: test
      revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
    metrics:
    - type: cos_sim_pearson
      value: 81.20285894915236
    - type: cos_sim_spearman
      value: 78.16322678527897
    - type: euclidean_pearson
      value: 80.6118408638417
    - type: euclidean_spearman
      value: 78.19033583671204
    - type: manhattan_pearson
      value: 80.41282660275819
    - type: manhattan_spearman
      value: 77.98611431591628
  - task:
      type: Classification
    dataset:
      type: None
      name: MTEB Banking77Classification
      config: default
      split: test
      revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
    metrics:
    - type: accuracy
      value: 85.25324675324676
    - type: f1
      value: 85.19854235582687
  - task:
      type: Clustering
    dataset:
      type: None
      name: MTEB BiorxivClusteringP2P
      config: default
      split: test
      revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
    metrics:
    - type: v_measure
      value: 39.65216461057432
    - type: v_measures
      value: [0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885,
0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885, 
0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885]
- task: type: Clustering dataset: type: None name: MTEB BiorxivClusteringS2S config: default split: test revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908 metrics: - type: v_measure value: 33.28787287895752 - type: v_measures value: [0.3235019092705102, 0.34053753555843735, 0.32485572754337366, 0.3149662563474906, 0.3326837187664875, 0.3229632335470733, 0.33078383561261365, 0.35111148393509534, 0.33383133843449825, 0.35355224888017306]
- task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackAndroidRetrieval config: default split: test revision: f46a197baaae43b4f621051089b82a364682dfeb metrics: - type: map_at_1 value: 32.677 - type: map_at_10 value: 43.739 - type: map_at_100 value: 45.152 - type: map_at_1000 value: 45.279 - type: map_at_20 value: 44.553 - type: map_at_3 value: 40.321 - type: map_at_5 value: 42.201 - type: mrr_at_1 value: 40.2 - type: mrr_at_10 value: 49.755 - type: mrr_at_100 value: 50.468 - type: mrr_at_1000 value: 50.513 - type: mrr_at_20 value: 50.192 - type: mrr_at_3 value: 47.163 - type: mrr_at_5 value: 48.686 - type: ndcg_at_1 value: 40.2 - type: ndcg_at_10 value: 49.963 - type: ndcg_at_100 value: 54.978 - type: ndcg_at_1000 value: 56.979 - type: ndcg_at_20 value: 51.983000000000004 - type: ndcg_at_3 value: 45.086999999999996 - type: ndcg_at_5 value: 47.309 - type: precision_at_1 value: 40.2 - type: precision_at_10 value: 9.328 - type: precision_at_100 value: 1.443 - type: precision_at_1000 value: 0.19 - type: precision_at_20 value: 5.558 - type: precision_at_3 value: 21.364 - type: precision_at_5 value: 15.222 - type: recall_at_1 value: 32.677 - type: recall_at_10 value: 61.71 - type: recall_at_100 value: 82.431 - type: recall_at_1000 value: 94.896 - type: recall_at_20 value: 68.73700000000001 - type: recall_at_3 value: 47.431 - type: recall_at_5 value: 53.739000000000004
- task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackEnglishRetrieval config: default split: test revision: ad9991cb51e31e31e430383c75ffb2885547b5f0 metrics: - type: map_at_1 value: 32.71 - type: map_at_10 value: 43.297000000000004 - type: map_at_100 value: 44.607 - type: map_at_1000 value: 44.729 - type: map_at_20 value: 44.013999999999996 - type: map_at_3 value: 40.213 - type: map_at_5 value: 42.004000000000005 - type: mrr_at_1 value: 40.892 - type: mrr_at_10 value: 49.394 - type: mrr_at_100 value: 50.005 - type: mrr_at_1000 value: 50.043000000000006 - type: mrr_at_20 value: 49.764 - type: mrr_at_3 value: 47.134 - type: mrr_at_5 value: 48.522 - type: ndcg_at_1 value: 40.892 - type: ndcg_at_10 value: 49.047000000000004 - type: ndcg_at_100 value: 53.266999999999996 - type: ndcg_at_1000 value: 55.096999999999994 - type: ndcg_at_20 value: 50.707 - type: ndcg_at_3 value: 44.896 - type: ndcg_at_5 value:
46.983000000000004 - type: precision_at_1 value: 40.892 - type: precision_at_10 value: 9.293 - type: precision_at_100 value: 1.473 - type: precision_at_1000 value: 0.192 - type: precision_at_20 value: 5.446 - type: precision_at_3 value: 21.592 - type: precision_at_5 value: 15.540999999999999 - type: recall_at_1 value: 32.71 - type: recall_at_10 value: 58.592999999999996 - type: recall_at_100 value: 76.242 - type: recall_at_1000 value: 87.717 - type: recall_at_20 value: 64.646 - type: recall_at_3 value: 46.253 - type: recall_at_5 value: 51.946999999999996 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackGamingRetrieval config: default split: test revision: 4885aa143210c98657558c04aaf3dc47cfb54340 metrics: - type: map_at_1 value: 41.644999999999996 - type: map_at_10 value: 53.825 - type: map_at_100 value: 54.82 - type: map_at_1000 value: 54.87499999999999 - type: map_at_20 value: 54.43 - type: map_at_3 value: 50.705 - type: map_at_5 value: 52.501 - type: mrr_at_1 value: 47.524 - type: mrr_at_10 value: 57.260999999999996 - type: mrr_at_100 value: 57.902 - type: mrr_at_1000 value: 57.931999999999995 - type: mrr_at_20 value: 57.689 - type: mrr_at_3 value: 55.089 - type: mrr_at_5 value: 56.38999999999999 - type: ndcg_at_1 value: 47.524 - type: ndcg_at_10 value: 59.41499999999999 - type: ndcg_at_100 value: 63.258 - type: ndcg_at_1000 value: 64.376 - type: ndcg_at_20 value: 61.149 - type: ndcg_at_3 value: 54.381 - type: ndcg_at_5 value: 56.89999999999999 - type: precision_at_1 value: 47.524 - type: precision_at_10 value: 9.386 - type: precision_at_100 value: 1.221 - type: precision_at_1000 value: 0.136 - type: precision_at_20 value: 5.223 - type: precision_at_3 value: 24.096 - type: precision_at_5 value: 16.364 - type: recall_at_1 value: 41.644999999999996 - type: recall_at_10 value: 72.386 - type: recall_at_100 value: 88.794 - type: recall_at_1000 value: 96.75399999999999 - type: recall_at_20 value: 78.74 - type: recall_at_3 value: 59.028000000000006 - type: recall_at_5 value: 65.197 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackGisRetrieval config: default split: test revision: 5003b3064772da1887988e05400cf3806fe491f2 metrics: - type: map_at_1 value: 28.648 - type: map_at_10 value: 36.388999999999996 - type: map_at_100 value: 37.372 - type: map_at_1000 value: 37.457 - type: map_at_20 value: 36.912 - type: map_at_3 value: 34.076 - type: map_at_5 value: 35.415 - type: mrr_at_1 value: 30.508000000000003 - type: mrr_at_10 value: 38.132 - type: mrr_at_100 value: 39.04 - type: mrr_at_1000 value: 39.106 - type: mrr_at_20 value: 38.643 - type: mrr_at_3 value: 35.876000000000005 - type: mrr_at_5 value: 37.208999999999996 - type: ndcg_at_1 value: 30.508000000000003 - type: ndcg_at_10 value: 40.762 - type: ndcg_at_100 value: 45.732 - type: ndcg_at_1000 value: 47.799 - type: ndcg_at_20 value: 42.591 - type: ndcg_at_3 value: 36.266999999999996 - type: ndcg_at_5 value: 38.58 - type: precision_at_1 value: 30.508000000000003 - type: precision_at_10 value: 6.010999999999999 - type: precision_at_100 value: 0.897 - type: precision_at_1000 value: 0.11100000000000002 - type: precision_at_20 value: 3.412 - type: precision_at_3 value: 14.991 - type: precision_at_5 value: 10.328 - type: recall_at_1 value: 28.648 - type: recall_at_10 value: 52.342999999999996 - type: recall_at_100 value: 75.268 - type: recall_at_1000 value: 90.641 - type: recall_at_20 value: 59.303 - type: recall_at_3 value: 40.447 - type: recall_at_5 value: 46.117000000000004 - task: type: 
Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackMathematicaRetrieval config: default split: test revision: 90fceea13679c63fe563ded68f3b6f06e50061de metrics: - type: map_at_1 value: 18.476 - type: map_at_10 value: 27.148 - type: map_at_100 value: 28.317999999999998 - type: map_at_1000 value: 28.427999999999997 - type: map_at_20 value: 27.764 - type: map_at_3 value: 24.801000000000002 - type: map_at_5 value: 26.133 - type: mrr_at_1 value: 22.886 - type: mrr_at_10 value: 31.741000000000003 - type: mrr_at_100 value: 32.708 - type: mrr_at_1000 value: 32.769 - type: mrr_at_20 value: 32.296 - type: mrr_at_3 value: 29.498 - type: mrr_at_5 value: 30.773 - type: ndcg_at_1 value: 22.886 - type: ndcg_at_10 value: 32.265 - type: ndcg_at_100 value: 37.829 - type: ndcg_at_1000 value: 40.558 - type: ndcg_at_20 value: 34.372 - type: ndcg_at_3 value: 28.105000000000004 - type: ndcg_at_5 value: 30.04 - type: precision_at_1 value: 22.886 - type: precision_at_10 value: 5.808 - type: precision_at_100 value: 0.985 - type: precision_at_1000 value: 0.13699999999999998 - type: precision_at_20 value: 3.495 - type: precision_at_3 value: 13.639999999999999 - type: precision_at_5 value: 9.577 - type: recall_at_1 value: 18.476 - type: recall_at_10 value: 43.442 - type: recall_at_100 value: 67.376 - type: recall_at_1000 value: 86.874 - type: recall_at_20 value: 51.038 - type: recall_at_3 value: 31.785999999999998 - type: recall_at_5 value: 36.858999999999995 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackPhysicsRetrieval config: default split: test revision: 79531abbd1fb92d06c6d6315a0cbbbf5bb247ea4 metrics: - type: map_at_1 value: 29.098000000000003 - type: map_at_10 value: 38.97 - type: map_at_100 value: 40.293 - type: map_at_1000 value: 40.397 - type: map_at_20 value: 39.778999999999996 - type: map_at_3 value: 35.723 - type: map_at_5 value: 37.519999999999996 - type: mrr_at_1 value: 35.515 - type: mrr_at_10 value: 44.55 - type: mrr_at_100 value: 45.37 - type: mrr_at_1000 value: 45.412 - type: mrr_at_20 value: 45.054 - type: mrr_at_3 value: 41.835 - type: mrr_at_5 value: 43.356 - type: ndcg_at_1 value: 35.515 - type: ndcg_at_10 value: 44.91 - type: ndcg_at_100 value: 50.27700000000001 - type: ndcg_at_1000 value: 52.215 - type: ndcg_at_20 value: 47.235 - type: ndcg_at_3 value: 39.505 - type: ndcg_at_5 value: 42.016 - type: precision_at_1 value: 35.515 - type: precision_at_10 value: 8.152 - type: precision_at_100 value: 1.262 - type: precision_at_1000 value: 0.16 - type: precision_at_20 value: 4.851 - type: precision_at_3 value: 18.447 - type: precision_at_5 value: 13.321 - type: recall_at_1 value: 29.098000000000003 - type: recall_at_10 value: 57.115 - type: recall_at_100 value: 79.467 - type: recall_at_1000 value: 92.162 - type: recall_at_20 value: 65.161 - type: recall_at_3 value: 42.254000000000005 - type: recall_at_5 value: 48.415 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackProgrammersRetrieval config: default split: test revision: 6184bc1440d2dbc7612be22b50686b8826d22b32 metrics: - type: map_at_1 value: 27.372000000000003 - type: map_at_10 value: 37.781 - type: map_at_100 value: 39.128 - type: map_at_1000 value: 39.238 - type: map_at_20 value: 38.592 - type: map_at_3 value: 34.782999999999994 - type: map_at_5 value: 36.466 - type: mrr_at_1 value: 33.904 - type: mrr_at_10 value: 43.15 - type: mrr_at_100 value: 44.049 - type: mrr_at_1000 value: 44.107 - type: mrr_at_20 value: 43.721 - type: mrr_at_3 value: 40.677 - type: mrr_at_5 value: 
42.19 - type: ndcg_at_1 value: 33.904 - type: ndcg_at_10 value: 43.527 - type: ndcg_at_100 value: 49.004999999999995 - type: ndcg_at_1000 value: 51.276999999999994 - type: ndcg_at_20 value: 45.988 - type: ndcg_at_3 value: 38.824999999999996 - type: ndcg_at_5 value: 41.04 - type: precision_at_1 value: 33.904 - type: precision_at_10 value: 7.854 - type: precision_at_100 value: 1.2309999999999999 - type: precision_at_1000 value: 0.16 - type: precision_at_20 value: 4.692 - type: precision_at_3 value: 18.531 - type: precision_at_5 value: 13.150999999999998 - type: recall_at_1 value: 27.372000000000003 - type: recall_at_10 value: 55.245999999999995 - type: recall_at_100 value: 78.278 - type: recall_at_1000 value: 93.718 - type: recall_at_20 value: 64.095 - type: recall_at_3 value: 41.665 - type: recall_at_5 value: 47.632000000000005 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackRetrieval config: default split: test revision: f46a197baaae43b4f621051089b82a364682dfeb metrics: - type: map_at_1 value: 27.734166666666667 - type: map_at_10 value: 36.858 - type: map_at_100 value: 38.043833333333325 - type: map_at_1000 value: 38.15541666666667 - type: map_at_20 value: 37.521249999999995 - type: map_at_3 value: 34.07658333333333 - type: map_at_5 value: 35.62683333333333 - type: mrr_at_1 value: 32.676249999999996 - type: mrr_at_10 value: 40.999 - type: mrr_at_100 value: 41.835 - type: mrr_at_1000 value: 41.8895 - type: mrr_at_20 value: 41.4865 - type: mrr_at_3 value: 38.645 - type: mrr_at_5 value: 39.99725000000001 - type: ndcg_at_1 value: 32.676249999999996 - type: ndcg_at_10 value: 42.08016666666666 - type: ndcg_at_100 value: 47.082750000000004 - type: ndcg_at_1000 value: 49.276583333333335 - type: ndcg_at_20 value: 44.04808333333334 - type: ndcg_at_3 value: 37.43375 - type: ndcg_at_5 value: 39.623000000000005 - type: precision_at_1 value: 32.676249999999996 - type: precision_at_10 value: 7.271 - type: precision_at_100 value: 1.1458333333333333 - type: precision_at_1000 value: 0.152 - type: precision_at_20 value: 4.282916666666667 - type: precision_at_3 value: 17.061416666666666 - type: precision_at_5 value: 12.05466666666667 - type: recall_at_1 value: 27.734166666666667 - type: recall_at_10 value: 53.33574999999999 - type: recall_at_100 value: 75.16275 - type: recall_at_1000 value: 90.34891666666665 - type: recall_at_20 value: 60.4935 - type: recall_at_3 value: 40.377916666666664 - type: recall_at_5 value: 46.0195 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackStatsRetrieval config: default split: test revision: 65ac3a16b8e91f9cee4c9828cc7c335575432a2a metrics: - type: map_at_1 value: 25.653 - type: map_at_10 value: 32.151 - type: map_at_100 value: 33.152 - type: map_at_1000 value: 33.243 - type: map_at_20 value: 32.717 - type: map_at_3 value: 30.287 - type: map_at_5 value: 31.25 - type: mrr_at_1 value: 28.988000000000003 - type: mrr_at_10 value: 35.131 - type: mrr_at_100 value: 36.002 - type: mrr_at_1000 value: 36.069 - type: mrr_at_20 value: 35.61 - type: mrr_at_3 value: 33.308 - type: mrr_at_5 value: 34.259 - type: ndcg_at_1 value: 28.988000000000003 - type: ndcg_at_10 value: 35.988 - type: ndcg_at_100 value: 40.764 - type: ndcg_at_1000 value: 43.112 - type: ndcg_at_20 value: 37.852999999999994 - type: ndcg_at_3 value: 32.562000000000005 - type: ndcg_at_5 value: 33.983000000000004 - type: precision_at_1 value: 28.988000000000003 - type: precision_at_10 value: 5.475 - type: precision_at_100 value: 0.8500000000000001 - type: 
precision_at_1000 value: 0.11199999999999999 - type: precision_at_20 value: 3.229 - type: precision_at_3 value: 13.905999999999999 - type: precision_at_5 value: 9.386999999999999 - type: recall_at_1 value: 25.653 - type: recall_at_10 value: 44.962 - type: recall_at_100 value: 66.405 - type: recall_at_1000 value: 83.88799999999999 - type: recall_at_20 value: 51.79899999999999 - type: recall_at_3 value: 35.144999999999996 - type: recall_at_5 value: 38.814 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackTexRetrieval config: default split: test revision: 46989137a86843e03a6195de44b09deda022eec7 metrics: - type: map_at_1 value: 17.825 - type: map_at_10 value: 25.592 - type: map_at_100 value: 26.613999999999997 - type: map_at_1000 value: 26.734 - type: map_at_20 value: 26.115 - type: map_at_3 value: 23.119 - type: map_at_5 value: 24.54 - type: mrr_at_1 value: 21.335 - type: mrr_at_10 value: 29.165000000000003 - type: mrr_at_100 value: 30.049 - type: mrr_at_1000 value: 30.121 - type: mrr_at_20 value: 29.639 - type: mrr_at_3 value: 26.863999999999997 - type: mrr_at_5 value: 28.185 - type: ndcg_at_1 value: 21.335 - type: ndcg_at_10 value: 30.357 - type: ndcg_at_100 value: 35.410000000000004 - type: ndcg_at_1000 value: 38.24 - type: ndcg_at_20 value: 32.08 - type: ndcg_at_3 value: 25.95 - type: ndcg_at_5 value: 28.081 - type: precision_at_1 value: 21.335 - type: precision_at_10 value: 5.506 - type: precision_at_100 value: 0.928 - type: precision_at_1000 value: 0.135 - type: precision_at_20 value: 3.2550000000000003 - type: precision_at_3 value: 12.239 - type: precision_at_5 value: 8.885 - type: recall_at_1 value: 17.825 - type: recall_at_10 value: 41.105999999999995 - type: recall_at_100 value: 64.17 - type: recall_at_1000 value: 84.19200000000001 - type: recall_at_20 value: 47.497 - type: recall_at_3 value: 28.862 - type: recall_at_5 value: 34.348 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackUnixRetrieval config: default split: test revision: 6c6430d3a6d36f8d2a829195bc5dc94d7e063e53 metrics: - type: map_at_1 value: 29.435 - type: map_at_10 value: 38.261 - type: map_at_100 value: 39.242 - type: map_at_1000 value: 39.347 - type: map_at_20 value: 38.742 - type: map_at_3 value: 35.457 - type: map_at_5 value: 37.043 - type: mrr_at_1 value: 34.235 - type: mrr_at_10 value: 42.24 - type: mrr_at_100 value: 42.988 - type: mrr_at_1000 value: 43.043 - type: mrr_at_20 value: 42.613 - type: mrr_at_3 value: 39.832 - type: mrr_at_5 value: 41.227000000000004 - type: ndcg_at_1 value: 34.235 - type: ndcg_at_10 value: 43.384 - type: ndcg_at_100 value: 48.14 - type: ndcg_at_1000 value: 50.414 - type: ndcg_at_20 value: 44.913 - type: ndcg_at_3 value: 38.454 - type: ndcg_at_5 value: 40.776 - type: precision_at_1 value: 34.235 - type: precision_at_10 value: 7.164 - type: precision_at_100 value: 1.065 - type: precision_at_1000 value: 0.13699999999999998 - type: precision_at_20 value: 4.021 - type: precision_at_3 value: 17.226 - type: precision_at_5 value: 12.071 - type: recall_at_1 value: 29.435 - type: recall_at_10 value: 54.93900000000001 - type: recall_at_100 value: 76.176 - type: recall_at_1000 value: 91.989 - type: recall_at_20 value: 60.451 - type: recall_at_3 value: 41.332 - type: recall_at_5 value: 47.316 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackWebmastersRetrieval config: default split: test revision: 160c094312a0e1facb97e55eeddb698c0abe3571 metrics: - type: map_at_1 value: 25.605 - type: map_at_10 value: 34.162 - 
type: map_at_100 value: 35.827999999999996 - type: map_at_1000 value: 36.04 - type: map_at_20 value: 35.016000000000005 - type: map_at_3 value: 30.984 - type: map_at_5 value: 32.717 - type: mrr_at_1 value: 30.435000000000002 - type: mrr_at_10 value: 38.681 - type: mrr_at_100 value: 39.656000000000006 - type: mrr_at_1000 value: 39.71 - type: mrr_at_20 value: 39.208999999999996 - type: mrr_at_3 value: 35.903 - type: mrr_at_5 value: 37.454 - type: ndcg_at_1 value: 30.435000000000002 - type: ndcg_at_10 value: 39.916000000000004 - type: ndcg_at_100 value: 45.958 - type: ndcg_at_1000 value: 48.449999999999996 - type: ndcg_at_20 value: 42.085 - type: ndcg_at_3 value: 34.696 - type: ndcg_at_5 value: 37.147000000000006 - type: precision_at_1 value: 30.435000000000002 - type: precision_at_10 value: 7.767 - type: precision_at_100 value: 1.547 - type: precision_at_1000 value: 0.23800000000000002 - type: precision_at_20 value: 4.941 - type: precision_at_3 value: 16.073999999999998 - type: precision_at_5 value: 11.937000000000001 - type: recall_at_1 value: 25.605 - type: recall_at_10 value: 50.654999999999994 - type: recall_at_100 value: 77.609 - type: recall_at_1000 value: 93.518 - type: recall_at_20 value: 58.845000000000006 - type: recall_at_3 value: 36.272 - type: recall_at_5 value: 42.596000000000004 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackWordpressRetrieval config: default split: test revision: 4ffe81d471b1924886b33c7567bfb200e9eec5c4 metrics: - type: map_at_1 value: 23.666 - type: map_at_10 value: 30.980999999999998 - type: map_at_100 value: 32.0 - type: map_at_1000 value: 32.098 - type: map_at_20 value: 31.621 - type: map_at_3 value: 28.449999999999996 - type: map_at_5 value: 29.731999999999996 - type: mrr_at_1 value: 25.692999999999998 - type: mrr_at_10 value: 32.788000000000004 - type: mrr_at_100 value: 33.783 - type: mrr_at_1000 value: 33.849000000000004 - type: mrr_at_20 value: 33.408 - type: mrr_at_3 value: 30.561 - type: mrr_at_5 value: 31.716 - type: ndcg_at_1 value: 25.692999999999998 - type: ndcg_at_10 value: 35.428 - type: ndcg_at_100 value: 40.375 - type: ndcg_at_1000 value: 42.802 - type: ndcg_at_20 value: 37.621 - type: ndcg_at_3 value: 30.476999999999997 - type: ndcg_at_5 value: 32.621 - type: precision_at_1 value: 25.692999999999998 - type: precision_at_10 value: 5.508 - type: precision_at_100 value: 0.848 - type: precision_at_1000 value: 0.116 - type: precision_at_20 value: 3.272 - type: precision_at_3 value: 12.631 - type: precision_at_5 value: 8.872 - type: recall_at_1 value: 23.666 - type: recall_at_10 value: 47.532000000000004 - type: recall_at_100 value: 69.73700000000001 - type: recall_at_1000 value: 87.83800000000001 - type: recall_at_20 value: 55.61000000000001 - type: recall_at_3 value: 34.06 - type: recall_at_5 value: 39.254 - task: type: Retrieval dataset: type: None name: MTEB ClimateFEVER config: default split: test revision: 47f2ac6acb640fc46020b02a5b59fdda04d39380 metrics: - type: map_at_1 value: 16.337 - type: map_at_10 value: 26.488 - type: map_at_100 value: 28.415000000000003 - type: map_at_1000 value: 28.584 - type: map_at_20 value: 27.557 - type: map_at_3 value: 22.665 - type: map_at_5 value: 24.542 - type: mrr_at_1 value: 36.417 - type: mrr_at_10 value: 48.001 - type: mrr_at_100 value: 48.784 - type: mrr_at_1000 value: 48.809000000000005 - type: mrr_at_20 value: 48.507 - type: mrr_at_3 value: 45.103 - type: mrr_at_5 value: 46.843 - type: ndcg_at_1 value: 36.417 - type: ndcg_at_10 value: 35.67 - type: ndcg_at_100 value: 
42.716 - type: ndcg_at_1000 value: 45.639 - type: ndcg_at_20 value: 38.471 - type: ndcg_at_3 value: 30.444 - type: ndcg_at_5 value: 32.004 - type: precision_at_1 value: 36.417 - type: precision_at_10 value: 10.73 - type: precision_at_100 value: 1.833 - type: precision_at_1000 value: 0.23800000000000002 - type: precision_at_20 value: 6.596 - type: precision_at_3 value: 22.302 - type: precision_at_5 value: 16.521 - type: recall_at_1 value: 16.337 - type: recall_at_10 value: 40.671 - type: recall_at_100 value: 64.55300000000001 - type: recall_at_1000 value: 80.934 - type: recall_at_20 value: 48.381 - type: recall_at_3 value: 27.279999999999998 - type: recall_at_5 value: 32.621 - task: type: Retrieval dataset: type: None name: MTEB DBPedia config: default split: test revision: c0f706b76e590d620bd6618b3ca8efdd34e2d659 metrics: - type: map_at_1 value: 9.056000000000001 - type: map_at_10 value: 19.419 - type: map_at_100 value: 27.069 - type: map_at_1000 value: 28.666000000000004 - type: map_at_20 value: 22.434 - type: map_at_3 value: 13.895 - type: map_at_5 value: 16.121 - type: mrr_at_1 value: 69.0 - type: mrr_at_10 value: 75.804 - type: mrr_at_100 value: 76.117 - type: mrr_at_1000 value: 76.125 - type: mrr_at_20 value: 76.009 - type: mrr_at_3 value: 74.375 - type: mrr_at_5 value: 75.4 - type: ndcg_at_1 value: 57.49999999999999 - type: ndcg_at_10 value: 41.495 - type: ndcg_at_100 value: 45.208 - type: ndcg_at_1000 value: 52.221 - type: ndcg_at_20 value: 40.617999999999995 - type: ndcg_at_3 value: 46.592 - type: ndcg_at_5 value: 43.559 - type: precision_at_1 value: 69.0 - type: precision_at_10 value: 32.574999999999996 - type: precision_at_100 value: 10.205 - type: precision_at_1000 value: 2.036 - type: precision_at_20 value: 24.687 - type: precision_at_3 value: 49.75 - type: precision_at_5 value: 42.0 - type: recall_at_1 value: 9.056000000000001 - type: recall_at_10 value: 24.866 - type: recall_at_100 value: 50.097 - type: recall_at_1000 value: 72.038 - type: recall_at_20 value: 31.858999999999998 - type: recall_at_3 value: 15.096000000000002 - type: recall_at_5 value: 18.548000000000002 - task: type: Classification dataset: type: None name: MTEB EmotionClassification config: default split: test revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37 metrics: - type: accuracy value: 48.259999999999984 - type: f1 value: 43.1498589523159 - task: type: Retrieval dataset: type: None name: MTEB FEVER config: default split: test revision: bea83ef9e8fb933d90a2f1d5515737465d613e12 metrics: - type: map_at_1 value: 74.798 - type: map_at_10 value: 83.454 - type: map_at_100 value: 83.623 - type: map_at_1000 value: 83.635 - type: map_at_20 value: 83.55 - type: map_at_3 value: 82.392 - type: map_at_5 value: 83.167 - type: mrr_at_1 value: 80.708 - type: mrr_at_10 value: 88.377 - type: mrr_at_100 value: 88.411 - type: mrr_at_1000 value: 88.411 - type: mrr_at_20 value: 88.402 - type: mrr_at_3 value: 87.646 - type: mrr_at_5 value: 88.232 - type: ndcg_at_1 value: 80.708 - type: ndcg_at_10 value: 87.35199999999999 - type: ndcg_at_100 value: 87.91600000000001 - type: ndcg_at_1000 value: 88.12299999999999 - type: ndcg_at_20 value: 87.593 - type: ndcg_at_3 value: 85.738 - type: ndcg_at_5 value: 86.845 - type: precision_at_1 value: 80.708 - type: precision_at_10 value: 10.432 - type: precision_at_100 value: 1.091 - type: precision_at_1000 value: 0.11299999999999999 - type: precision_at_20 value: 5.296 - type: precision_at_3 value: 32.778 - type: precision_at_5 value: 20.399 - type: recall_at_1 value: 74.798 - type: 
recall_at_10 value: 94.459 - type: recall_at_100 value: 96.614 - type: recall_at_1000 value: 97.868 - type: recall_at_20 value: 95.254 - type: recall_at_3 value: 90.144 - type: recall_at_5 value: 92.965 - task: type: Retrieval dataset: type: None name: MTEB FiQA2018 config: default split: test revision: 27a168819829fe9bcd655c2df245fb19452e8e06 metrics: - type: map_at_1 value: 20.008 - type: map_at_10 value: 32.731 - type: map_at_100 value: 34.467999999999996 - type: map_at_1000 value: 34.643 - type: map_at_20 value: 33.717000000000006 - type: map_at_3 value: 28.427999999999997 - type: map_at_5 value: 30.788 - type: mrr_at_1 value: 40.586 - type: mrr_at_10 value: 49.056 - type: mrr_at_100 value: 49.887 - type: mrr_at_1000 value: 49.929 - type: mrr_at_20 value: 49.552 - type: mrr_at_3 value: 46.785 - type: mrr_at_5 value: 48.004000000000005 - type: ndcg_at_1 value: 40.586 - type: ndcg_at_10 value: 40.589999999999996 - type: ndcg_at_100 value: 47.03 - type: ndcg_at_1000 value: 49.994 - type: ndcg_at_20 value: 43.229 - type: ndcg_at_3 value: 37.061 - type: ndcg_at_5 value: 37.992 - type: precision_at_1 value: 40.586 - type: precision_at_10 value: 11.219 - type: precision_at_100 value: 1.781 - type: precision_at_1000 value: 0.232 - type: precision_at_20 value: 6.705 - type: precision_at_3 value: 24.743000000000002 - type: precision_at_5 value: 18.086 - type: recall_at_1 value: 20.008 - type: recall_at_10 value: 47.412 - type: recall_at_100 value: 71.274 - type: recall_at_1000 value: 88.898 - type: recall_at_20 value: 55.706999999999994 - type: recall_at_3 value: 33.346 - type: recall_at_5 value: 39.112 - task: type: Retrieval dataset: type: None name: MTEB HotpotQA config: default split: test revision: ab518f4d6fcca38d87c25209f94beba119d02014 metrics: - type: map_at_1 value: 41.789 - type: map_at_10 value: 57.898 - type: map_at_100 value: 58.632 - type: map_at_1000 value: 58.693 - type: map_at_20 value: 58.314 - type: map_at_3 value: 55.236 - type: map_at_5 value: 56.852999999999994 - type: mrr_at_1 value: 83.57900000000001 - type: mrr_at_10 value: 87.631 - type: mrr_at_100 value: 87.764 - type: mrr_at_1000 value: 87.77000000000001 - type: mrr_at_20 value: 87.70700000000001 - type: mrr_at_3 value: 87.02499999999999 - type: mrr_at_5 value: 87.34100000000001 - type: ndcg_at_1 value: 83.57900000000001 - type: ndcg_at_10 value: 67.11399999999999 - type: ndcg_at_100 value: 69.686 - type: ndcg_at_1000 value: 70.926 - type: ndcg_at_20 value: 68.119 - type: ndcg_at_3 value: 63.402 - type: ndcg_at_5 value: 65.354 - type: precision_at_1 value: 83.57900000000001 - type: precision_at_10 value: 13.333 - type: precision_at_100 value: 1.537 - type: precision_at_1000 value: 0.16999999999999998 - type: precision_at_20 value: 6.988999999999999 - type: precision_at_3 value: 38.929 - type: precision_at_5 value: 24.897 - type: recall_at_1 value: 41.789 - type: recall_at_10 value: 66.664 - type: recall_at_100 value: 76.833 - type: recall_at_1000 value: 85.14500000000001 - type: recall_at_20 value: 69.892 - type: recall_at_3 value: 58.392999999999994 - type: recall_at_5 value: 62.242 - task: type: Classification dataset: type: None name: MTEB ImdbClassification config: default split: test revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7 metrics: - type: accuracy value: 86.6108 - type: ap value: 81.63890253106925 - type: f1 value: 86.54585789538082 - task: type: Retrieval dataset: type: None name: MTEB MSMARCO config: default split: dev revision: c5a29a104738b98a9e76336939199e264163d4a0 metrics: - type: map_at_1 
value: 22.407 - type: map_at_10 value: 34.603 - type: map_at_100 value: 35.808 - type: map_at_1000 value: 35.855 - type: map_at_20 value: 35.368 - type: map_at_3 value: 30.764000000000003 - type: map_at_5 value: 32.964 - type: mrr_at_1 value: 23.009 - type: mrr_at_10 value: 35.136 - type: mrr_at_100 value: 36.284 - type: mrr_at_1000 value: 36.325 - type: mrr_at_20 value: 35.869 - type: mrr_at_3 value: 31.351000000000003 - type: mrr_at_5 value: 33.54 - type: ndcg_at_1 value: 23.009 - type: ndcg_at_10 value: 41.471999999999994 - type: ndcg_at_100 value: 47.211999999999996 - type: ndcg_at_1000 value: 48.361 - type: ndcg_at_20 value: 44.169000000000004 - type: ndcg_at_3 value: 33.646 - type: ndcg_at_5 value: 37.580000000000005 - type: precision_at_1 value: 23.009 - type: precision_at_10 value: 6.54 - type: precision_at_100 value: 0.941 - type: precision_at_1000 value: 0.104 - type: precision_at_20 value: 3.832 - type: precision_at_3 value: 14.283999999999999 - type: precision_at_5 value: 10.564 - type: recall_at_1 value: 22.407 - type: recall_at_10 value: 62.678999999999995 - type: recall_at_100 value: 89.09700000000001 - type: recall_at_1000 value: 97.822 - type: recall_at_20 value: 73.116 - type: recall_at_3 value: 41.4 - type: recall_at_5 value: 50.855 - task: type: Classification dataset: type: None name: MTEB MTOPDomainClassification (en) config: en split: test revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf metrics: - type: accuracy value: 92.94573643410853 - type: f1 value: 92.73148878666994 - task: type: Classification dataset: type: None name: MTEB MTOPIntentClassification (en) config: en split: test revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba metrics: - type: accuracy value: 77.86137710898313 - type: f1 value: 60.360562463738724 - task: type: Classification dataset: type: None name: MTEB MassiveIntentClassification (en) config: en split: test revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 metrics: - type: accuracy value: 73.83322125084062 - type: f1 value: 71.61864304680206 - task: type: Classification dataset: type: None name: MTEB MassiveScenarioClassification (en) config: en split: test revision: 7d571f92784cd94a019292a1f45445077d0ef634 metrics: - type: accuracy value: 77.50504371217215 - type: f1 value: 77.52039268347185 - task: type: Clustering dataset: type: None name: MTEB MedrxivClusteringP2P config: default split: test revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73 metrics: - type: v_measure value: 34.346952648910225 - type: v_measures value: [0.3246964225451952, 0.33269208719245646, 0.3355911472371345, 0.32978655133380147, 0.3275090874657499, 0.3752583186941529, 0.3494711327267592, 0.36636134409497156, 0.3538734420417993, 0.3394557315590024]
- task: type: Clustering dataset: type: None name: MTEB MedrxivClusteringS2S config: default split: test revision: 35191c8c0dca72d8ff3efcd72aa802307d469663 metrics: - type: v_measure value: 32.19992734583148 - type: v_measures value: [0.31100967211136193, 0.31302897733611235, 0.3126922134381441, 0.30243629014133017, 0.31564501718268645, 0.34772968477866795, 0.32522623268021805, 0.3410158265159116, 0.33581770403870503, 0.31539111636001027]
- task: type: Reranking dataset: type: None name: MTEB MindSmallReranking config: default split: test revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69 metrics: - type: map value: 30.62309561205373 - type: mrr value: 31.707879717902554 - task: type: Retrieval dataset: type: None name: MTEB NFCorpus config: default split: test revision: ec0fa4fe99da2ff19ca1214b7966684033a58814 metrics: - type: map_at_1 value: 5.668 - type: map_at_10 value: 12.225999999999999 - type: map_at_100 value: 15.122 - type: map_at_1000 value: 16.422 - type: map_at_20 value: 13.361999999999998 - type: map_at_3 value: 9.083 - type: map_at_5 value: 10.5 - type: mrr_at_1 value: 46.44 - type: mrr_at_10 value: 53.553 - type: mrr_at_100 value: 54.15 - type: mrr_at_1000 value: 54.193000000000005 - type: mrr_at_20 value: 53.837 - type: mrr_at_3 value: 51.702999999999996 - type: mrr_at_5 value: 52.647 - type: ndcg_at_1 value: 44.272 - type: ndcg_at_10 value: 33.395 - type: ndcg_at_100 value: 29.976999999999997 - type: ndcg_at_1000 value: 38.388 - type: ndcg_at_20 value: 30.606 - type: ndcg_at_3 value: 39.212 - type: ndcg_at_5 value: 36.611 - type: precision_at_1 value: 46.129999999999995 - type: precision_at_10 value: 24.334 - type: precision_at_100 value: 7.553999999999999 - type: precision_at_1000 value: 1.994 - type: precision_at_20 value: 17.678 - type: precision_at_3 value: 36.326 - type: precision_at_5 value: 31.330999999999996 - type: recall_at_1 value: 5.668 - type: recall_at_10 value: 15.837000000000002 - type: recall_at_100 value: 29.845 - type: recall_at_1000 value: 60.563 - type: recall_at_20 value: 18.587999999999997 - type: recall_at_3 value: 10.096 - type: recall_at_5 value: 12.261 - task: type: Retrieval dataset: type: None name: MTEB NQ config: default split: test revision: b774495ed302d8c44a3a7ea25c90dbce03968f31 metrics: - type: map_at_1 value: 39.335 - type: map_at_10 value: 54.932 - type: map_at_100 value: 55.742000000000004 - type: map_at_1000 value: 55.766000000000005 - type: map_at_20 value: 55.504 - type: map_at_3 value: 50.904 - type: map_at_5 value: 53.388999999999996 - type: mrr_at_1 value: 44.003 - type: mrr_at_10 value: 57.419 - type: mrr_at_100 value: 57.963 - type: mrr_at_1000 value:
57.981 - type: mrr_at_20 value: 57.80499999999999 - type: mrr_at_3 value: 54.30199999999999 - type: mrr_at_5 value: 56.257000000000005 - type: ndcg_at_1 value: 43.974999999999994 - type: ndcg_at_10 value: 62.153999999999996 - type: ndcg_at_100 value: 65.326 - type: ndcg_at_1000 value: 65.862 - type: ndcg_at_20 value: 63.922999999999995 - type: ndcg_at_3 value: 54.834 - type: ndcg_at_5 value: 58.857000000000006 - type: precision_at_1 value: 43.974999999999994 - type: precision_at_10 value: 9.722 - type: precision_at_100 value: 1.153 - type: precision_at_1000 value: 0.12 - type: precision_at_20 value: 5.3 - type: precision_at_3 value: 24.392 - type: precision_at_5 value: 16.993 - type: recall_at_1 value: 39.335 - type: recall_at_10 value: 81.501 - type: recall_at_100 value: 94.851 - type: recall_at_1000 value: 98.817 - type: recall_at_20 value: 87.968 - type: recall_at_3 value: 62.795 - type: recall_at_5 value: 71.985 - task: type: Retrieval dataset: type: None name: MTEB QuoraRetrieval config: default split: test revision: e4e08e0b7dbe3c8700f0daef558ff32256715259 metrics: - type: map_at_1 value: 71.222 - type: map_at_10 value: 85.193 - type: map_at_100 value: 85.802 - type: map_at_1000 value: 85.81800000000001 - type: map_at_20 value: 85.587 - type: map_at_3 value: 82.253 - type: map_at_5 value: 84.142 - type: mrr_at_1 value: 82.04 - type: mrr_at_10 value: 88.101 - type: mrr_at_100 value: 88.196 - type: mrr_at_1000 value: 88.196 - type: mrr_at_20 value: 88.175 - type: mrr_at_3 value: 87.145 - type: mrr_at_5 value: 87.825 - type: ndcg_at_1 value: 82.04 - type: ndcg_at_10 value: 88.849 - type: ndcg_at_100 value: 89.992 - type: ndcg_at_1000 value: 90.089 - type: ndcg_at_20 value: 89.468 - type: ndcg_at_3 value: 86.06899999999999 - type: ndcg_at_5 value: 87.669 - type: precision_at_1 value: 82.04 - type: precision_at_10 value: 13.447000000000001 - type: precision_at_100 value: 1.528 - type: precision_at_1000 value: 0.157 - type: precision_at_20 value: 7.116 - type: precision_at_3 value: 37.617 - type: precision_at_5 value: 24.776 - type: recall_at_1 value: 71.222 - type: recall_at_10 value: 95.73899999999999 - type: recall_at_100 value: 99.572 - type: recall_at_1000 value: 99.988 - type: recall_at_20 value: 97.725 - type: recall_at_3 value: 87.742 - type: recall_at_5 value: 92.23400000000001 - task: type: Clustering dataset: type: None name: MTEB RedditClustering config: default split: test revision: 24640382cdbf8abc73003fb0fa6d111a705499eb metrics: - type: v_measure value: 56.502005725283524 - type: v_measures value: [0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 
0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 
0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 
0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 
0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 
0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 
0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 
0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 
0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 
0.5280806756230237]
- task:
    type: Clustering
  dataset:
    type: None
    name: MTEB RedditClusteringP2P
    config: default
    split: test
    revision: 385e3cb46b4cfa89021f56c4380204149d0efe33
  metrics:
  - type: v_measure
    value: 63.14989421688691
  - type: v_measures
    value: [0.673210410652684, 0.6825035243902045, 0.6275126414823813, 0.40001836573261074, 0.711458797825346, 0.6212317163461291, 0.4113635660304527, 0.7394060043565659, 0.6969073197749642, 0.7513770750973534]
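  # Note (not part of the original card): the aggregate v_measure above appears to be
  # the mean of the per-run v_measures, scaled by 100. A minimal, illustrative Python
  # check under that assumption:
  #   vals = [0.673210410652684, 0.6825035243902045, 0.6275126414823813,
  #           0.40001836573261074, 0.711458797825346, 0.6212317163461291,
  #           0.4113635660304527, 0.7394060043565659, 0.6969073197749642,
  #           0.7513770750973534]
  #   print(100 * sum(vals) / len(vals))  # ~63.1499, matching the reported v_measure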
- task:
    type: Retrieval
  dataset:
    type: None
    name: MTEB SCIDOCS
    config: default
    split: test
    revision: f8c2fcf00f625baaa80f62ec5bd9e1fff3b8ae88
  metrics:
  - type: map_at_1
    value: 4.4830000000000005
  - type: map_at_10
    value: 11.04
  - type: map_at_100
    value: 12.764000000000001
  - type: map_at_1000
    value: 13.04
  - type: map_at_20
    value: 11.953
  - type: map_at_3
    value: 8.125
  - type: map_at_5
    value: 9.565999999999999
  - type: mrr_at_1
    value: 22.1
  - type: mrr_at_10
    value: 32.494
  - type: mrr_at_100
    value: 33.525
  - type: mrr_at_1000
    value: 33.596
  - type: mrr_at_20
    value: 33.089
  - type: mrr_at_3
    value: 29.416999999999998
  - type: mrr_at_5
    value: 31.267
  - type: ndcg_at_1
    value: 22.1
  - type: ndcg_at_10
    value: 18.587
  - type: ndcg_at_100
    value: 25.482
  - type: ndcg_at_1000
    value: 30.581999999999997
  - type: ndcg_at_20
    value: 21.077
  - type: ndcg_at_3
    value: 18.165
  - type: ndcg_at_5
    value: 15.676000000000002
  - type: precision_at_1
    value: 22.1
  - type: precision_at_10
    value: 9.48
  - type: precision_at_100
    value: 1.942
  - type: precision_at_1000
    value: 0.316
  - type: precision_at_20
    value: 6.175
  - type: precision_at_3
    value: 17.033
  - type: precision_at_5
    value: 13.719999999999999
  - type: recall_at_1
    value: 4.4830000000000005
  - type: recall_at_10
    value: 19.208
  - type: recall_at_100
    value: 39.417
  - type: recall_at_1000
    value: 64.235
  - type: recall_at_20
    value: 25.057000000000002
  - type: recall_at_3
    value: 10.348
  - type: recall_at_5
    value: 13.893
- task:
    type: STS
  dataset:
    type: None
    name: MTEB SICK-R
    config: default
    split: test
    revision: 20a6d6f312dd54037fe07a32d58e5e168867909d
  metrics:
  - type: cos_sim_pearson
    value: 83.50181312649208
  - type: cos_sim_spearman
    value: 79.92900705478993
  - type: euclidean_pearson
    value: 81.13482128094503
  - type: euclidean_spearman
    value: 79.92732266864367
  - type: manhattan_pearson
    value: 81.06702121654993
  - type: manhattan_spearman
    value: 79.86983106619135
- task:
    type: STS
  dataset:
    type: None
    name: MTEB STS12
    config: default
    split: test
    revision: a0d554a64d88156834ff5ae9920b964011b16384
  metrics:
  - type: cos_sim_pearson
    value: 83.85431681906961
  - type: cos_sim_spearman
    value: 77.61671419416626
  - type: euclidean_pearson
    value: 81.30538320520961
  - type: euclidean_spearman
    value: 77.62096481461272
  - type: manhattan_pearson
    value: 81.2306021173407
  - type: manhattan_spearman
    value: 77.58386300715222
- task:
    type: STS
  dataset:
    type: None
    name: MTEB STS13
    config: default
    split: test
    revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
  metrics:
  - type: cos_sim_pearson
    value: 84.98057702322754
  - type: cos_sim_spearman
    value: 86.13305071688859
  - type: euclidean_pearson
    value: 85.70903555966376
  - type: euclidean_spearman
    value: 86.13150222328171
  - type: manhattan_pearson
    value: 85.69380834788831
  - type: manhattan_spearman
    value: 86.10784739081191
- task:
    type: STS
  dataset:
    type: None
    name: MTEB STS14
    config: default
    split: test
    revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
  metrics:
  - type: cos_sim_pearson
    value: 83.43368314724589
  - type: cos_sim_spearman
    value: 81.26767916144169
  - type: euclidean_pearson
    value: 83.23234690932492
  - type: euclidean_spearman
    value: 81.2671726214706
  - type: manhattan_pearson
    value: 83.2381239261109
  - type: manhattan_spearman
    value: 81.27674961470714
- task:
    type: STS
  dataset:
    type: None
    name: MTEB STS15
    config: default
    split: test
    revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
  metrics:
  - type: cos_sim_pearson
    value: 86.8637546411748
  - type: cos_sim_spearman
    value: 88.25330888676139
  - type: euclidean_pearson
    value: 87.81194589390417
  - type: euclidean_spearman
    value: 88.25258669625579
  - type: manhattan_pearson
    value: 87.8131866998459
  - type: manhattan_spearman
    value: 88.26523268929576
- task:
    type: STS
  dataset:
    type: None
    name: MTEB STS16
    config: default
    split: test
    revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
  metrics:
  - type: cos_sim_pearson
    value: 83.83129743147286
  - type: cos_sim_spearman
    value: 85.73732687732624
  - type: euclidean_pearson
    value: 85.18051277328075
  - type: euclidean_spearman
    value: 85.73565846174445
  - type: manhattan_pearson
    value: 85.179029651079
  - type: manhattan_spearman
    value: 85.75709685404729
- task:
    type: STS
  dataset:
    type: None
    name: MTEB STS17 (en-en)
    config: en-en
    split: test
    revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
  metrics:
  - type: cos_sim_pearson
    value: 87.04715794253148
  - type: cos_sim_spearman
    value: 87.61577496386343
  - type: euclidean_pearson
    value: 88.34713614361046
  - type: euclidean_spearman
    value: 87.56541901567275
  - type: manhattan_pearson
    value: 88.26010824585985
  - type: manhattan_spearman
    value: 87.35211736948182
- task:
    type: STS
  dataset:
    type: None
    name: MTEB STS22 (en)
    config: en
    split: test
    revision: eea2b4fe26a775864c896887d910b76a8098ad3f
  metrics:
  - type: cos_sim_pearson
    value: 62.36160793264433
  - type: cos_sim_spearman
    value: 66.07767480051893
  - type: euclidean_pearson
    value: 66.4716471304865
  - type: euclidean_spearman
    value: 66.03999286501872
  - type: manhattan_pearson
    value: 66.46197824372902
  - type: manhattan_spearman
    value: 65.82936468127227
- task:
    type: STS
  dataset:
    type: None
    name: MTEB STSBenchmark
    config: default
    split: test
    revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
  metrics:
  - type: cos_sim_pearson
    value: 85.27768996785856
  - type: cos_sim_spearman
    value: 86.96704639052885
  - type: euclidean_pearson
    value: 86.48753189555983
  - type: euclidean_spearman
    value: 86.96981285751171
  - type: manhattan_pearson
    value: 86.49262465015401
  - type: manhattan_spearman
    value: 86.95378609580054
- task:
    type: Reranking
  dataset:
    type: None
    name: MTEB SciDocsRR
    config: default
    split: test
    revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab
  metrics:
  - type: map
    value: 81.52012853393428
  - type: mrr
    value: 94.70817671798063
- task:
    type: Retrieval
  dataset:
    type: None
    name: MTEB SciFact
    config: default
    split: test
    revision: 0228b52cf27578f30900b9e5271d331663a030d7
  metrics:
  - type: map_at_1
    value: 55.344
  - type: map_at_10
    value: 64.82900000000001
  - type: map_at_100
    value: 65.42
  - type: map_at_1000
    value: 65.443
  - type: map_at_20
    value: 65.2
  - type: map_at_3
    value: 61.8
  - type: map_at_5
    value: 63.510999999999996
  - type: mrr_at_1
    value: 58.333
  - type: mrr_at_10
    value: 66.24600000000001
  - type: mrr_at_100
    value: 66.742
  - type: mrr_at_1000
    value: 66.762
  - type: mrr_at_20
    value: 66.549
  - type: mrr_at_3
    value: 64.056
  - type: mrr_at_5
    value: 65.372
  - type: ndcg_at_1
    value: 58.333
  - type: ndcg_at_10
    value: 69.626
  - type: ndcg_at_100
    value: 72.236
  - type: ndcg_at_1000
    value: 72.872
  - type: ndcg_at_20
    value: 70.864
  - type: ndcg_at_3
    value: 64.50399999999999
  - type: ndcg_at_5
    value: 67.07600000000001
  - type: precision_at_1
    value: 58.333
  - type: precision_at_10
    value: 9.4
  - type: precision_at_100
    value: 1.073
  - type: precision_at_1000
    value: 0.11299999999999999
  - type: precision_at_20
    value: 4.983
  - type: precision_at_3
    value: 25.222
  - type: precision_at_5
    value: 16.8
  - type: recall_at_1
    value: 55.344
  - type: recall_at_10
    value: 82.789
  - type: recall_at_100
    value: 94.6
  - type: recall_at_1000
    value: 99.667
  - type: recall_at_20
    value: 87.533
  - type: recall_at_3
    value: 69.18299999999999
  - type: recall_at_5
    value: 75.622
- task:
    type: PairClassification
  dataset:
    type: None
    name: MTEB SprintDuplicateQuestions
    config: default
    split: test
    revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
  metrics:
  - type: cos_sim_accuracy
    value: 99.69405940594059
  - type: cos_sim_ap
    value: 92.03642221694545
  - type: cos_sim_f1
    value: 84.06395048994327
  - type: cos_sim_precision
    value: 86.79446219382322
  - type: cos_sim_recall
    value: 81.5
  - type: dot_accuracy
    value: 99.6930693069307
  - type: dot_ap
    value: 91.9971441434875
  - type: dot_f1
    value: 83.8006230529595
  - type: dot_precision
    value: 87.14902807775377
  - type: dot_recall
    value: 80.7
  - type: euclidean_accuracy
    value: 99.69504950495049
  - type: euclidean_ap
    value: 92.03626548389335
  - type: euclidean_f1
    value: 84.10732714138285
  - type: euclidean_precision
    value: 86.88699360341151
  - type: euclidean_recall
    value: 81.5
  - type: manhattan_accuracy
    value: 99.69504950495049
  - type: manhattan_ap
    value: 92.02049659660081
  - type: manhattan_f1
    value: 84.34959349593495
  - type: manhattan_precision
    value: 85.74380165289256
  - type: manhattan_recall
    value: 83.0
  - type: max_accuracy
    value: 99.69504950495049
  - type: max_ap
    value: 92.03642221694545
  - type: max_f1
    value: 84.34959349593495
- task:
    type: Clustering
  dataset:
    type: None
    name: MTEB StackExchangeClustering
    config: default
    split: test
    revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
  metrics:
  - type: v_measure
    value: 67.04916654680977
  - type: v_measures
    value: [0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096]
- task:
    type: Clustering
  dataset:
    type: None
    name: MTEB StackExchangeClusteringP2P
    config: default
    split: test
    revision: 815ca46b2622cec33ccafc3735d572c266efdb44
  metrics:
  - type: v_measure
    value: 33.36641413495258
  - type: v_measures
    value: [0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235,
0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 
0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 
0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235] - task: type: Reranking dataset: type: None name: MTEB StackOverflowDupQuestions config: default split: test revision: e185fbe320c72810689fc5848eb6114e1ef5ec69 metrics: - type: map value: 51.19282080158746 - type: mrr value: 51.871100713012474 - task: type: Summarization dataset: type: None name: MTEB SummEval config: default split: test revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c metrics: - type: cos_sim_pearson value: 31.437664703708485 - type: cos_sim_spearman value: 31.391119208581575 - type: dot_pearson value: 31.19925970504054 - type: dot_spearman value: 31.38087224016694 - task: type: Retrieval dataset: type: None name: MTEB TRECCOVID config: default split: test revision: bb9466bac8153a0349341eb1b22e06409e78ef4e metrics: - type: map_at_1 value: 0.249 - type: map_at_10 value: 2.163 - type: map_at_100 value: 13.242999999999999 - type: map_at_1000 value: 30.866 - type: map_at_20 value: 3.9539999999999997 - type: map_at_3 value: 0.718 - type: map_at_5 value: 1.169 - type: mrr_at_1 value: 96.0 - type: mrr_at_10 value: 98.0 - type: mrr_at_100 value: 98.0 - type: mrr_at_1000 value: 98.0 - type: mrr_at_20 value: 98.0 - type: mrr_at_3 value: 98.0 - type: mrr_at_5 value: 98.0 - type: ndcg_at_1 value: 92.0 - type: ndcg_at_10 value: 84.147 - type: ndcg_at_100 value: 65.143 - type: ndcg_at_1000 value: 56.038 - type: ndcg_at_20 value: 80.869 - type: ndcg_at_3 value: 89.11200000000001 - type: ndcg_at_5 value: 87.199 - type: precision_at_1 value: 96.0 - type: precision_at_10 value: 87.8 - type: precision_at_100 value: 66.72 - type: 
precision_at_1000 value: 24.684 - type: precision_at_20 value: 84.3 - type: precision_at_3 value: 94.0 - type: precision_at_5 value: 91.2 - type: recall_at_1 value: 0.249 - type: recall_at_10 value: 2.284 - type: recall_at_100 value: 16.025 - type: recall_at_1000 value: 52.068999999999996 - type: recall_at_20 value: 4.3180000000000005 - type: recall_at_3 value: 0.738 - type: recall_at_5 value: 1.212 - task: type: Retrieval dataset: type: None name: MTEB Touche2020 config: default split: test revision: a34f9a33db75fa0cbb21bb5cfc3dae8dc8bec93f metrics: - type: map_at_1 value: 3.4520000000000004 - type: map_at_10 value: 13.045000000000002 - type: map_at_100 value: 19.442 - type: map_at_1000 value: 21.09 - type: map_at_20 value: 15.667 - type: map_at_3 value: 7.409000000000001 - type: map_at_5 value: 9.73 - type: mrr_at_1 value: 46.939 - type: mrr_at_10 value: 60.295 - type: mrr_at_100 value: 60.904 - type: mrr_at_1000 value: 60.919000000000004 - type: mrr_at_20 value: 60.77 - type: mrr_at_3 value: 58.50300000000001 - type: mrr_at_5 value: 59.014 - type: ndcg_at_1 value: 44.897999999999996 - type: ndcg_at_10 value: 31.911 - type: ndcg_at_100 value: 41.945 - type: ndcg_at_1000 value: 53.181999999999995 - type: ndcg_at_20 value: 31.505 - type: ndcg_at_3 value: 39.745000000000005 - type: ndcg_at_5 value: 35.528999999999996 - type: precision_at_1 value: 46.939 - type: precision_at_10 value: 26.531 - type: precision_at_100 value: 8.163 - type: precision_at_1000 value: 1.559 - type: precision_at_20 value: 19.387999999999998 - type: precision_at_3 value: 40.136 - type: precision_at_5 value: 33.878 - type: recall_at_1 value: 3.4520000000000004 - type: recall_at_10 value: 18.899 - type: recall_at_100 value: 50.207 - type: recall_at_1000 value: 83.871 - type: recall_at_20 value: 26.756999999999998 - type: recall_at_3 value: 8.729000000000001 - type: recall_at_5 value: 12.084999999999999 - task: type: Classification dataset: type: None name: MTEB ToxicConversationsClassification config: default split: test revision: edfaf9da55d3dd50d43143d90c1ac476895ae6de metrics: - type: accuracy value: 67.4560546875 - type: ap value: 12.720403845355294 - type: f1 value: 51.76062666567839 - task: type: Classification dataset: type: None name: MTEB TweetSentimentExtractionClassification config: default split: test revision: d604517c81ca91fe16a244d1248fc021f9ecee7a metrics: - type: accuracy value: 62.36276174306734 - type: f1 value: 62.69956906934332 - task: type: Clustering dataset: type: None name: MTEB TwentyNewsgroupsClustering config: default split: test revision: 6125ec4e24fa026cec8a478383ee943acfbd5449 metrics: - type: v_measure value: 49.473492910233965 - type: v_measures value: [0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281] - task: type: PairClassification dataset: type: None name: MTEB TwitterSemEval2015 config: default split: test revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1 metrics: - type: cos_sim_accuracy value: 85.75430649102938 - type: cos_sim_ap value: 73.62842656477649 - type: cos_sim_f1 value: 67.76023680315738 - type: cos_sim_precision value: 63.61741547012506 - type: cos_sim_recall value: 72.4802110817942 - type: dot_accuracy value: 85.7423854085951 - type: dot_ap value: 73.59147637253723 - type: dot_f1 value: 67.69498693867396 - type: dot_precision value: 64.03859731701577 - type: dot_recall value: 71.79419525065963 - type: euclidean_accuracy value: 85.7423854085951 - type: euclidean_ap value: 73.6288990409654 - type: euclidean_f1 value: 67.80415430267064 - type: euclidean_precision value: 63.79711493718009 - type: euclidean_recall value: 72.34828496042216 - type: manhattan_accuracy value: 85.69470107885796 - type: manhattan_ap value: 73.49219614602531 - type: manhattan_f1 value: 67.60809797550613 - type: manhattan_precision value: 64.22127255460589 - type: manhattan_recall value: 71.37203166226914 - type: max_accuracy value: 85.75430649102938 - type: max_ap value: 73.6288990409654 - type: max_f1 value: 67.80415430267064 - task: type: PairClassification dataset: type: None name: MTEB TwitterURLCorpus config: default split: test revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf metrics: - type: cos_sim_accuracy value: 89.08293553770326 - type: cos_sim_ap value: 86.21246419992926 - type: cos_sim_f1 value: 78.49922526377924 - type: cos_sim_precision value: 75.35769939084857 - type: cos_sim_recall value: 81.9140745303357 - type: dot_accuracy value: 89.08681647067955 - type: dot_ap value: 86.19733517196862 - type: dot_f1 value: 78.51132446157838 - type: dot_precision value: 75.70233755093287 - type: dot_recall value: 
81.53680320295658 - type: euclidean_accuracy value: 89.07517367175069 - type: euclidean_ap value: 86.21198725320203 - type: euclidean_f1 value: 78.49867139061116 - type: euclidean_precision value: 75.38276155372839 - type: euclidean_recall value: 81.88327687095781 - type: manhattan_accuracy value: 89.0538285403811 - type: manhattan_ap value: 86.17785515765131 - type: manhattan_f1 value: 78.48184098593084 - type: manhattan_precision value: 74.34396308285694 - type: manhattan_recall value: 83.10748383122882 - type: max_accuracy value: 89.08681647067955 - type: max_ap value: 86.21246419992926 - type: max_f1 value: 78.51132446157838 license: apache-2.0 language: - en base_model: - answerdotai/ModernBERT-base - nomic-ai/modernbert-embed-unsupervised base_model_relation: finetune --- # ModernBERT Embed | Model | Dimensions | Average (56) | Classification (12) | Clustering (11) | Pair Classification (3) | Reranking (4) | Retrieval (15) | STS (10) | Summarization (1) | |-----------------------|------------|--------------|---------------------|-----------------|-------------------------|---------------|----------------|-----------|------------------| | nomic-embed-text-v1 | 768 | 62.4 | 74.1 | 43.9 | **85.2** | 55.7 | 52.8 | 82.1 | 30.1 | | nomic-embed-text-v1.5 | 768 | 62.28 | 73.55 | 43.93 | 84.61 | 55.78 | **53.01** | **81.94** | 30.4 | | modernbert-embed-base | 768 | **62.62** | **74.31** | **44.98** | 83.96 | **56.42** | 52.89 | 81.78 | **31.39** | | nomic-embed-text-v1.5 | 256 | 61.04 | 72.1 | 43.16 | 84.09 | 55.18 | 50.81 | 81.34 | 30.05 | | modernbert-embed-base | 256 | 61.17 | 72.40 | 43.82 | 83.45 | 55.69 | 50.62 | 81.12 | 31.27 | ## Usage You can use these models directly with the latest transformers release; doing so requires installing: Reminder, this model is trained similarly to Nomic Embed and **REQUIRES** prefixes to be added to the input. For more information, see the instructions in Nomic Embed. For most use cases, adding the query prefix to the query and the document prefix to the documents will be sufficient. ### Sentence Transformers Click to see Sentence Transformers usage with Matryoshka Truncation In Sentence Transformers, you can truncate embeddings to a smaller dimension by using the parameter when loading the model. Note the small differences compared to the full 768-dimensional similarities. ### Transformers Click to see Transformers usage with Matryoshka Truncation In Transformers, you can truncate embeddings to a smaller dimension by slicing the mean pooled embeddings, prior to normalization. Note the small differences compared to the full 768-dimensional similarities. ### Transformers.js If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Then, you can compute embeddings as follows: ## Training Click the Nomic Atlas map below to visualize a 5M sample of our contrastive pretraining data! \n- Retrieval (e.g., ArguAna) \n- Clustering (e.g., ArxivClusteringP2P) \n- Benchmarked on MTEB tasks \n\nComparison: \nOutperforms or matches baseline metrics in MTEB tasks like"
+}
\ No newline at end of file
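The usage notes in the modernbert-embed entry above (required Nomic-style prefixes, Matryoshka truncation in Sentence Transformers, and mean pooling plus slicing in plain Transformers) can be illustrated with a minimal sketch. The model id, the `search_query: `/`search_document: ` prefixes, and the 256-dimension truncation below are assumptions based on the Nomic Embed convention the card references, not strings taken from this entry; check the upstream README for the exact values.

```python
# Minimal sketch, not the official example: model id, prefixes, and truncation
# dimension are assumptions based on the Nomic Embed convention mentioned above.
from sentence_transformers import SentenceTransformer

# truncate_dim applies Matryoshka-style truncation when the model is loaded.
model = SentenceTransformer("nomic-ai/modernbert-embed-base", truncate_dim=256)

queries = ["search_query: What is TSNE?"]  # assumed query prefix
documents = ["search_document: t-SNE is a dimensionality reduction technique."]  # assumed document prefix

query_emb = model.encode(queries)
doc_emb = model.encode(documents)
print(model.similarity(query_emb, doc_emb))  # cosine similarity by default
```

For the plain Transformers path the card describes (truncating by slicing the mean-pooled embeddings before normalization), a comparable sketch under the same assumptions would be:

```python
# Minimal sketch of mean pooling + Matryoshka slicing; same assumptions as above.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

name = "nomic-ai/modernbert-embed-base"  # assumed model id
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

batch = tokenizer(["search_query: What is TSNE?"], padding=True,
                  truncation=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, seq_len, 768)

# Mean-pool over non-padding tokens, then slice to 256 dims *before* normalizing.
mask = batch["attention_mask"].unsqueeze(-1).float()
pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
emb = F.normalize(pooled[:, :256], p=2, dim=1)
print(emb.shape)  # torch.Size([1, 256])
```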
diff --git a/model_data_json/nvidia_Cosmos-1.0-Diffusion-7B-Video2World.json b/model_data_json/nvidia_Cosmos-1.0-Diffusion-7B-Video2World.json
new file mode 100644
index 0000000000000000000000000000000000000000..3e837a815fe5d2cdd4656ec27566637e738942c0
--- /dev/null
+++ b/model_data_json/nvidia_Cosmos-1.0-Diffusion-7B-Video2World.json
@@ -0,0 +1,15 @@
+{
+ "model_id": "nvidia/Cosmos-1.0-Diffusion-7B-Video2World",
+ "downloads": 244460,
+ "tags": [
+ "cosmos",
+ "safetensors",
+ "nvidia",
+ "nemo",
+ "arxiv:2501.03575",
+ "license:other",
+ "region:us"
+ ],
+ "description": "--- license: other license_name: nvidia-open-model-license license_link: >- library_name: cosmos tags: - nvidia - nemo - cosmos - diffusers extra_gated_prompt: >- # NVIDIA Open Model License Agreement Version Release Date: January 6, 2025 This NVIDIA Open Model License Agreement (the \"Agreement\") is a legal agreement between the Legal Entity You represent, or if no entity is identified, You and NVIDIA Corporation and its Affiliates (\"NVIDIA\") and governs Your use of the Models that NVIDIA provides to You under this Agreement. NVIDIA and You are each a \"party\" and collectively the \"parties.\" NVIDIA models released under this Agreement are intended to be used permissively and enable the further development of AI technologies. Subject to the terms of this Agreement, NVIDIA confirms that: * Models are commercially usable. * You are free to create and distribute Derivative Models. * NVIDIA does not claim ownership to any outputs generated using the Models or Model Derivatives. By using, reproducing, modifying, distributing, performing or displaying any portion or element of the Model or Derivative Model, or otherwise accepting the terms of this Agreement, you agree to be bound by this Agreement. ## 1. Definitions The following definitions apply to this Agreement: 1.1. \"NVIDIA Cosmos Model\" means a multimodal Model shared under this Agreement. 1.2. \"Derivative Model\" means all (a) modifications to the Model, (b) works based on the Model, and (c) any other derivative works of the Model. An output is not a Derivative Model. 1.3. \"Legal Entity\" means the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, \"control\" means (a) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (b) ownership of fifty percent (50%) or more of the outstanding shares, or (c) beneficial ownership of such entity. 1.4. \"Model\" means the machine learning model, software, checkpoints, learnt weights, algorithms, parameters, configuration files and documentation shared under this Agreement. 1.5. \"You\" or \"Your\" means an individual or Legal Entity exercising permissions granted by this Agreement. ## 2. Conditions for Use, License Grant, AI Ethics and IP Ownership 2.1. Conditions for Use. The Model and any Derivative Model are subject to additional terms as described in Section 2 and Section 3 of this Agreement and govern Your use. If You institute copyright or patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Model or a Derivative Model constitutes direct or contributory copyright or patent infringement, then any licenses granted to You under this Agreement for that Model or Derivative Model will terminate as of the date such litigation is filed. If You bypass, disable, reduce the efficacy of, or circumvent any technical limitation, safety guardrail or associated safety guardrail hyperparameter, encryption, security, digital rights management, or authentication mechanism contained in the Model, your rights under this Agreement will automatically terminate. NVIDIA may update this Agreement to comply with legal and regulatory requirements at any time and You agree to either comply with any updated license or cease Your copying, use, and distribution of the Model and any Derivative Model. 2.2. License Grant. 
The rights granted herein are explicitly conditioned on Your full compliance with the terms of this Agreement. Subject to the terms and conditions of this Agreement, NVIDIA hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, revocable (as stated in Section 2.1) license to publicly perform, publicly display, reproduce, use, create derivative works of, make, have made, sell, offer for sale, distribute (through multiple tiers of distribution) and import the Model. 2.3. AI Ethics. Use of the Models under the Agreement must be consistent with NVIDIA's Trustworthy AI terms found at 2.4. NVIDIA owns the Model and any Model Derivatives created by NVIDIA. Subject to NVIDIA's underlying ownership rights in the Model or its Model Derivatives, You are and will be the owner of Your Model Derivatives. NVIDIA claims no ownership rights in outputs. You are responsible for outputs and their subsequent uses. Except as expressly granted in this Agreement, (a) NVIDIA reserves all rights, interests and remedies in connection with the Model and (b) no other license or right is granted to you by implication, estoppel or otherwise. ## 3. Redistribution You may reproduce and distribute copies of the Model or Derivative Models thereof in any medium, with or without modifications, provided that You meet the following conditions: 3.1. If you distribute the Model, You must give any other recipients of the Model a copy of this Agreement and include the following attribution notice within a \"Notice\" text file with such copies: \"Licensed by NVIDIA Corporation under the NVIDIA Open Model License\"; 3.2. If you distribute or make available a NVIDIA Cosmos Model, or a product or service (including an AI model) that contains or uses a NVIDIA Cosmos Model, use a NVIDIA Cosmos Model to create a Derivative Model, or use a NVIDIA Cosmos Model or its outputs to create, train, fine tune, or otherwise improve an AI model, you will include \"Built on NVIDIA Cosmos\" on a related website, user interface, blogpost, about page, or product documentation; and 3.3. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Models as a whole, provided Your use, reproduction, and distribution of the Model otherwise complies with the conditions stated in this Agreement. ## 4. Trademarks This Agreement does not grant permission to use the trade names, trademarks, service marks, or product names of NVIDIA, except as required for reasonable and customary use in describing the origin of the Model and reproducing the content of the \"Notice\" text file. ## **5. Disclaimer of Warranty** **Unless required by applicable law or agreed to in writing, NVIDIA provides the Model on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Model, Derivative Models and outputs and assume any risks associated with Your exercise of permissions under this Agreement.** ## **6. 
Limitation of Liability** **In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, will NVIDIA be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this Agreement or out of the use or inability to use the Model, Derivative Models or outputs (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if NVIDIA has been advised of the possibility of such damages.** ## 7. Indemnity You will indemnify and hold harmless NVIDIA from and against any claim by any third party arising out of or related to your use or distribution of the Model, Model Derivatives or outputs. ## 8. Feedback NVIDIA appreciates your feedback, and You agree that NVIDIA may use it without restriction or compensation to You. ## 9. Governing Law This Agreement will be governed in all respects by the laws of the United States and the laws of the State of Delaware, without regard to conflict of laws principles or the United Nations Convention on Contracts for the International Sale of Goods. The state and federal courts residing in Santa Clara County, California will have exclusive jurisdiction over any dispute or claim arising out of or related to this Agreement, and the parties irrevocably consent to personal jurisdiction and venue in those courts; except that, either party may apply for injunctive remedies or an equivalent type of urgent legal relief in any jurisdiction. ## 10. Trade and Compliance You agree to comply with all applicable export, import, trade and economic sanctions laws and regulations, as amended, including without limitation U.S. Export Administration Regulations and Office of Foreign Assets Control regulations. These laws include restrictions on destinations, end-users and end-use. extra_gated_fields: By clicking Submit below, I accept the terms of the NVIDIA Open Model License Agreement and acknowledge that I am an adult of legal age of majority in the country in which the Cosmos Models will be used and have authority to accept this Agreement: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the NVIDIA Privacy Policy. extra_gated_button_content: Submit --- # **Cosmos-1.0-Diffusion**: A Suite of Diffusion-based World Foundation Models **Cosmos** | **Code** | **Paper** | **Paper Website** # Model Overview ## Description: **Cosmos World Foundation Models**: A family of highly performant pre-trained world foundation models purpose-built for generating physics-aware videos and world states for physical AI development. The Cosmos diffusion models are a collection of diffusion based world foundation models that generate dynamic, high quality videos from text, image, or video inputs. It can serve as the building block for various applications or research that are related to world generation. The models are ready for commercial use under NVIDIA Open Model license agreement. **Model Developer**: NVIDIA ## Model Versions In Cosmos 1.0 release, the Cosmos Diffusion WFM family includes the following models: - Cosmos-1.0-Diffusion-7B-Text2World - Given a text description, predict an output video of 121 frames. 
- Cosmos-1.0-Diffusion-14B-Text2World - Given a text description, predict an output video of 121 frames. - Cosmos-1.0-Diffusion-7B-Video2World - Given a text description and an image as the first frame, predict the future 120 frames. - Cosmos-1.0-Diffusion-14B-Video2World - Given a text description and an image as the first frame, predict the future 120 frames. ### License: This model is released under the NVIDIA Open Model License. For a custom license, please contact cosmos-license@nvidia.com. Under the NVIDIA Open Model License, NVIDIA confirms: * Models are commercially usable. * You are free to create and distribute Derivative Models. * NVIDIA does not claim ownership to any outputs generated using the Models or Derivative Models. **Important Note**: If you bypass, disable, reduce the efficacy of, or circumvent any technical limitation, **safety guardrail** or associated safety guardrail hyperparameter, encryption, security, digital rights management, or authentication mechanism contained in the Model, your rights under NVIDIA Open Model License Agreement will automatically terminate. * Cosmos-1.0-Guardrail is the safety guardrail for this model. ## Model Architecture: Cosmos-1.0-Diffusion-7B-Video2World is a diffusion transformer model designed for video denoising in the latent space. The network is composed of interleaved self-attention, cross-attention and feedforward layers as its building blocks. The cross-attention layers allow the model to condition on input text throughout the denoising process. Before each layer, adaptive layer normalization is applied to embed the time information for denoising. When an image or video is provided as input, its latent frames are concatenated with the generated frames along the temporal dimension. Augment noise is added to conditional latent frames to bridge the training and inference gap. ## Input/Output Specifications * **Input** * **Input Type(s)**: Text+Image, Text+Video * **Input Format(s)**: * Text: String * Image: jpg, png, jpeg, webp * Video: mp4 * **Input Parameters**: * Text: One-dimensional (1D) * Image: Two-dimensional (2D) * Video: Three-dimensional (3D) * **Other Properties Related to Input**: * The input string should contain fewer than 300 words and should provide descriptive content for world generation, such as a scene description, key objects or characters, background, and any specific actions or motions to be depicted within the 5-second duration. * The input image should be of 1280x704 resolution. * The input video should be of 1280x704 resolution and 9 input frames. * **Output** * **Output Type(s)**: Video * **Output Format(s)**: mp4 * **Output Parameters**: Three-dimensional (3D) * **Other Properties Related to Output**: By default, the generated video is a 5-second clip with a resolution of 1280x704 pixels and a frame rate of 24 frames per second (fps). The video content visualizes the input text description as a short animated scene, capturing key elements within the specified time constraints. Aspect ratios and resolutions are configurable, with options including 1:1 (960x960 pixels), 4:3 (960x704 pixels), 3:4 (704x960 pixels), 16:9 (1280x704 pixels), and 9:16 (704x1280 pixels). The frame rate is also adjustable within a range of 12 to 40 fps. ## Software Integration **Runtime Engine(s):** * Cosmos * Diffusers **Supported Hardware Microarchitecture Compatibility:** * NVIDIA Blackwell * NVIDIA Hopper * NVIDIA Ampere **Note**: We have only tested doing inference with BF16 precision. 
**Operating System(s):** * Linux (We have not tested on other operating systems.) # Usage * See Cosmos for details. Cosmos can also be used with Diffusers! # Evaluation Please see our technical paper for detailed evaluations. ## Inference Time and GPU Memory Usage The numbers provided below may vary depending on system specs and are for reference only. | Offloading Strategy | 7B Video2World | 14B Video2World | |----------------------------------------------------------------------------------|---------|---------| | Offload prompt upsampler | 76.5 GB | > 80.0 GB | | Offload prompt upsampler & guardrails | 59.9 GB | 73.3 GB | | Offload prompt upsampler & guardrails & T5 encoder | 41.3 GB | 54.8 GB | | Offload prompt upsampler & guardrails & T5 encoder & tokenizer | 41.1 GB | 54.5 GB | | Offload prompt upsampler & guardrails & T5 encoder & tokenizer & diffusion model | 27.3 GB | 39.0 GB | The following table shows the end-to-end inference runtime on a single H100 GPU, excluding model initialization time: | 7B Video2World (offload prompt upsampler) | 14B Video2World (offload prompt upsampler, guardrails) | |---------|---------| | ~383 seconds | ~593 seconds | ## Ethical Considerations NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for this model, please see the subcards of Explainability, Bias, Safety & Security, and Privacy below. Please report security vulnerabilities or NVIDIA AI Concerns here. ### Plus Plus (++) Promise We value you, the datasets, the diversity they represent, and what we have been entrusted with. This model and its associated data have been: * Verified to comply with current applicable disclosure laws, regulations, and industry standards. * Verified to comply with applicable privacy labeling requirements. * Annotated to describe the collector/source (NVIDIA or a third-party). * Characterized for technical limitations. * Reviewed to ensure proper disclosure is accessible to, maintained for, and in compliance with NVIDIA data subjects and their requests. * Reviewed before release. * Tagged for known restrictions and potential safety implications. ### Bias Field | Response :---------------------------------------------------------------------------------------------------|:--------------- Participation considerations from adversely impacted groups protected classes in model design and testing: | None Measures taken to mitigate against unwanted bias: | None ### Explainability Field | Response :------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------- Intended Application & Domain: | World Generation Model Type: | Transformer Intended Users: | Physical AI developers Output: | Videos Describe how the model works: | Generates videos based on video inputs Technical Limitations: | The model may not follow the video input accurately. 
Verified to have met prescribed NVIDIA quality standards: | Yes Performance Metrics: | Quantitative and Qualitative Evaluation Potential Known Risks: | The model's output can generate all forms of videos, including what may be considered toxic, offensive, or indecent. Licensing: | NVIDIA Open Model License ### Privacy Field | Response :----------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------- Generatable or reverse engineerable personal information? | None Known Protected class data used to create this model? | None Known Was consent obtained for any personal data used? | None Known How often is dataset reviewed? | Before Release Is a mechanism in place to honor data subject right of access or deletion of personal data? | Not Applicable If personal data was collected for the development of the model, was it collected directly by NVIDIA? | Not Applicable If personal data was collected for the development of the model by NVIDIA, do you maintain or have access to disclosures made to data subjects? | Not Applicable If personal data was collected for the development of this AI model, was it minimized to only what was required? | Not Applicable Is there provenance for all datasets used in training? | Yes Does data labeling (annotation, metadata) comply with privacy laws? | Yes Is data compliant with data subject requests for data correction or removal, if such a request was made? | Not Applicable ### Safety Field | Response :---------------------------------------------------|:---------------------------------- Model Application(s): | World Generation Describe the life critical impact (if present). | None Known Use Case Restrictions: | NVIDIA Open Model License Model and dataset restrictions: | The Principle of least privilege (PoLP) is applied limiting access for dataset generation and model development. Restrictions enforce dataset access during training, and dataset license constraints adhered to. Model checkpoints are made available on Hugging Face, and may become available on cloud providers' model catalog.",
+ "model_explanation_gemini": "Generates video-to-world simulations using diffusion-based multimodal AI under NVIDIA's permissive open-model license.\n\n**Features:** \n- Video-to-world simulation via diffusion \n- Multimodal capabilities \n- Commercially usable under NVIDIA's license \n- Supports derivative model creation \n- No ownership claims on generated outputs \n\n**Comparison:** \nUnlike standard video generation models, this one focuses on simulating dynamic \"world\" environments from video inputs, leveraging NVIDIA's Cosmos framework for multimodal diffusion. It offers broader commercial"
+}
\ No newline at end of file
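The Usage section of the Cosmos card above only points to the Cosmos repository and notes Diffusers support, without a snippet. A minimal, hedged sketch of loading the Video2World checkpoint through the generic `DiffusionPipeline` loader is shown below; the exact pipeline class that gets resolved, the call-argument names, and the `.frames` output attribute are assumptions, and the conditioning image path is a placeholder.

```python
# Hedged sketch only: loading Cosmos-1.0-Diffusion-7B-Video2World via diffusers.
# The resolved pipeline class and the prompt/image argument names are assumptions,
# not confirmed by the card above; BF16 is the only precision the card says was tested.
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image, export_to_video

pipe = DiffusionPipeline.from_pretrained(
    "nvidia/Cosmos-1.0-Diffusion-7B-Video2World",   # checkpoint id assumed from the model list
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

image = load_image("first_frame.png")               # placeholder 1280x704 conditioning frame
result = pipe(
    prompt="A robot arm stacks red blocks on a factory conveyor belt.",
    image=image,
)
export_to_video(result.frames[0], "output.mp4", fps=24)
```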
diff --git a/model_data_json/nvidia_mit-b0.json b/model_data_json/nvidia_mit-b0.json
new file mode 100644
index 0000000000000000000000000000000000000000..8fddd8c34d78748bbb0f76b669b0d5e188b0f1f8
--- /dev/null
+++ b/model_data_json/nvidia_mit-b0.json
@@ -0,0 +1,20 @@
+{
+ "model_id": "nvidia/mit-b0",
+ "downloads": 77331,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "tf",
+ "segformer",
+ "image-classification",
+ "vision",
+ "dataset:imagenet_1k",
+ "arxiv:2105.15203",
+ "license:other",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- license: other tags: - vision datasets: - imagenet_1k widget: - src: example_title: House - src: example_title: Castle --- # SegFormer (b0-sized) encoder pre-trained-only SegFormer encoder fine-tuned on Imagenet-1k. It was introduced in the paper SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers by Xie et al. and first released in this repository. Disclaimer: The team releasing SegFormer did not write a model card for this model so this model card has been written by the Hugging Face team. ## Model description SegFormer consists of a hierarchical Transformer encoder and a lightweight all-MLP decode head to achieve great results on semantic segmentation benchmarks such as ADE20K and Cityscapes. The hierarchical Transformer is first pre-trained on ImageNet-1k, after which a decode head is added and fine-tuned altogether on a downstream dataset. This repository only contains the pre-trained hierarchical Transformer, hence it can be used for fine-tuning purposes. ## Intended uses & limitations You can use the model for fine-tuning of semantic segmentation. See the model hub to look for fine-tuned versions on a task that interests you. ### How to use Here is how to use this model to classify an image of the COCO 2017 dataset into one of the 1,000 ImageNet classes: For more code examples, we refer to the documentation. ### License The license for this model can be found here. ### BibTeX entry and citation info",
+ "model_explanation_gemini": "Pre-trained SegFormer (b0-sized) hierarchical Transformer encoder for fine-tuning on semantic segmentation tasks.\n\nFeatures: \n- Hierarchical Transformer encoder \n- Pre-trained on ImageNet-1k \n- Designed for semantic segmentation (requires adding MLP decode head for downstream use) \n- Lightweight architecture \n\nComparison: \nUnlike full SegFormer models that include both encoder and MLP decode head, this variant provides only the pre-trained encoder component for custom fine-tuning."
+}
\ No newline at end of file
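The "How to use" section of the SegFormer (mit-b0) card above references a classification snippet that was stripped from this JSON. A minimal sketch using the standard transformers SegFormer classes follows; the COCO image URL is just an illustrative placeholder.

```python
# Minimal sketch: ImageNet-1k classification with the mit-b0 encoder,
# assuming the standard transformers SegFormer classes.
import torch
import requests
from PIL import Image
from transformers import AutoImageProcessor, SegformerForImageClassification

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # placeholder COCO image
image = Image.open(requests.get(url, stream=True).raw)

processor = AutoImageProcessor.from_pretrained("nvidia/mit-b0")
model = SegformerForImageClassification.from_pretrained("nvidia/mit-b0")

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])
```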
diff --git a/model_data_json/nvidia_parakeet-rnnt-0.6b.json b/model_data_json/nvidia_parakeet-rnnt-0.6b.json
new file mode 100644
index 0000000000000000000000000000000000000000..d643c4d0d1a7fc22906fd6e851baa1a0e14816bf
--- /dev/null
+++ b/model_data_json/nvidia_parakeet-rnnt-0.6b.json
@@ -0,0 +1,36 @@
+{
+ "model_id": "nvidia/parakeet-rnnt-0.6b",
+ "downloads": 73324,
+ "tags": [
+ "nemo",
+ "automatic-speech-recognition",
+ "speech",
+ "audio",
+ "Transducer",
+ "FastConformer",
+ "Conformer",
+ "pytorch",
+ "NeMo",
+ "hf-asr-leaderboard",
+ "en",
+ "dataset:librispeech_asr",
+ "dataset:fisher_corpus",
+ "dataset:Switchboard-1",
+ "dataset:WSJ-0",
+ "dataset:WSJ-1",
+ "dataset:National-Singapore-Corpus-Part-1",
+ "dataset:National-Singapore-Corpus-Part-6",
+ "dataset:vctk",
+ "dataset:voxpopuli",
+ "dataset:europarl",
+ "dataset:multilingual_librispeech",
+ "dataset:mozilla-foundation/common_voice_8_0",
+ "dataset:MLCommons/peoples_speech",
+ "arxiv:2305.05084",
+ "license:cc-by-4.0",
+ "model-index",
+ "region:us"
+ ],
+ "description": "--- language: - en library_name: nemo datasets: - librispeech_asr - fisher_corpus - Switchboard-1 - WSJ-0 - WSJ-1 - National-Singapore-Corpus-Part-1 - National-Singapore-Corpus-Part-6 - vctk - voxpopuli - europarl - multilingual_librispeech - mozilla-foundation/common_voice_8_0 - MLCommons/peoples_speech thumbnail: null tags: - automatic-speech-recognition - speech - audio - Transducer - FastConformer - Conformer - pytorch - NeMo - hf-asr-leaderboard license: cc-by-4.0 widget: - example_title: Librispeech sample 1 src: - example_title: Librispeech sample 2 src: model-index: - name: parakeet-rnnt-0.6b results: - task: name: Automatic Speech Recognition type: automatic-speech-recognition dataset: name: AMI (Meetings test) type: edinburghcstr/ami config: ihm split: test args: language: en metrics: - name: Test WER type: wer value: 17.55 - task: name: Automatic Speech Recognition type: automatic-speech-recognition dataset: name: Earnings-22 type: revdotcom/earnings22 split: test args: language: en metrics: - name: Test WER type: wer value: 14.78 - task: name: Automatic Speech Recognition type: automatic-speech-recognition dataset: name: GigaSpeech type: speechcolab/gigaspeech split: test args: language: en metrics: - name: Test WER type: wer value: 10.07 - task: name: Automatic Speech Recognition type: automatic-speech-recognition dataset: name: LibriSpeech (clean) type: librispeech_asr config: other split: test args: language: en metrics: - name: Test WER type: wer value: 1.63 - task: name: Automatic Speech Recognition type: automatic-speech-recognition dataset: name: LibriSpeech (other) type: librispeech_asr config: other split: test args: language: en metrics: - name: Test WER type: wer value: 3.06 - task: type: Automatic Speech Recognition name: automatic-speech-recognition dataset: name: SPGI Speech type: kensho/spgispeech config: test split: test args: language: en metrics: - name: Test WER type: wer value: 3.47 - task: type: Automatic Speech Recognition name: automatic-speech-recognition dataset: name: tedlium-v3 type: LIUM/tedlium config: release1 split: test args: language: en metrics: - name: Test WER type: wer value: 3.86 - task: name: Automatic Speech Recognition type: automatic-speech-recognition dataset: name: Vox Populi type: facebook/voxpopuli config: en split: test args: language: en metrics: - name: Test WER type: wer value: 6.05 - task: type: Automatic Speech Recognition name: automatic-speech-recognition dataset: name: Mozilla Common Voice 9.0 type: mozilla-foundation/common_voice_9_0 config: en split: test args: language: en metrics: - name: Test WER type: wer value: 8.07 metrics: - wer pipeline_tag: automatic-speech-recognition --- # Parakeet RNNT 0.6B (en) | | is an ASR model that transcribes speech in lower case English alphabet. This model is jointly developed by NVIDIA NeMo and Suno.ai teams. It is an XL version of FastConformer Transducer [1] (around 600M parameters) model. See the model architecture section and NeMo documentation for complete architecture details. ## NVIDIA NeMo: Training To train, fine-tune or play with the model you will need to install NVIDIA NeMo. We recommend you install it after you've installed latest PyTorch version. ## How to Use this Model The model is available for use in the NeMo toolkit [3], and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset. 
### Automatically instantiate the model ### Transcribing using Python First, let's get a sample Then simply do: ### Transcribing many audio files ### Input This model accepts 16000 Hz mono-channel audio (wav files) as input. ### Output This model provides transcribed speech as a string for a given audio sample. ## Model Architecture FastConformer [1] is an optimized version of the Conformer model with 8x depthwise-separable convolutional downsampling. The model is trained in a multitask setup with a Transducer decoder (RNNT) loss. You may find more information on the details of FastConformer here: Fast-Conformer Model. ## Training The NeMo toolkit [3] was used for training the models for over several hundred epochs. These models are trained with this example script and this base config. The tokenizers for these models were built using the text transcripts of the train set with this script. ### Datasets The model was trained on 64K hours of English speech collected and prepared by the NVIDIA NeMo and Suno teams. The training dataset consists of a private subset with 40K hours of English speech plus 24K hours from the following public datasets: - Librispeech 960 hours of English speech - Fisher Corpus - Switchboard-1 Dataset - WSJ-0 and WSJ-1 - National Speech Corpus (Part 1, Part 6) - VCTK - VoxPopuli (EN) - Europarl-ASR (EN) - Multilingual Librispeech (MLS EN) - 2,000 hour subset - Mozilla Common Voice (v7.0) - People's Speech - 12,000 hour subset ## Performance The performance of Automatic Speech Recognition models is measured using Word Error Rate (WER). Since this model is trained on multiple domains and a much larger corpus, it generally performs better when transcribing audio across domains. The following table summarizes the performance of the available models in this collection with the Transducer decoder. Performance of the ASR models is reported in terms of Word Error Rate (WER%) with greedy decoding. |**Version**|**Tokenizer**|**Vocabulary Size**|**AMI**|**Earnings-22**|**Giga Speech**|**LS test-clean**|**LS test-other**|**SPGI Speech**|**TEDLIUM-v3**|**Vox Populi**|**Common Voice**| |---------|-----------------------|-----------------|-------|---------------|---------------|-----------------|-----------------|---------------|--------------|--------------|----------------| | 1.22.0 | SentencePiece Unigram | 1024 | 17.55 | 14.78 | 10.07 | 1.63 | 3.06 | 3.47 | 3.86 | 6.05 | 8.07 | These are greedy WER numbers without external LM. More details on evaluation can be found at HuggingFace ASR Leaderboard ## NVIDIA Riva: Deployment NVIDIA Riva is an accelerated speech AI SDK deployable on-prem, in all clouds, multi-cloud, hybrid, on edge, and embedded. Additionally, Riva provides: * World-class out-of-the-box accuracy for the most common languages with model checkpoints trained on proprietary data with hundreds of thousands of GPU-compute hours * Best in class accuracy with run-time word boosting (e.g., brand and product names) and customization of acoustic model, language model, and inverse text normalization * Streaming speech recognition, Kubernetes compatible scaling, and enterprise-grade support. Although this model isn’t supported yet by Riva, the list of supported models is here. Check out Riva live demo. ## References [1] Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition [2] Google Sentencepiece Tokenizer [3] NVIDIA NeMo Toolkit [4] Suno.ai [5] HuggingFace ASR Leaderboard ## License License to use this model is covered by the CC-BY-4.0. 
By downloading the public and release version of the model, you accept the terms and conditions of the CC-BY-4.0 license.",
+ "model_explanation_gemini": "Performs automatic speech recognition (ASR) to transcribe English speech into lowercase text using a FastConformer Transducer architecture with ~600M parameters. \n\n**Features**: \n- Language: English \n- Architecture: FastConformer Transducer (RNNT) \n- Parameters: ~600M (XL size) \n- Input: 16kHz mono-channel audio (WAV) \n- Output: Lowercase transcribed text \n- Datasets: Trained on LibriSpeech, Fisher Corpus"
+}
\ No newline at end of file
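The "Automatically instantiate the model" and "Transcribing using Python" snippets in the Parakeet card above were stripped from this JSON. A short sketch with the NeMo toolkit follows, assuming the usual `ASRModel.from_pretrained` entry point and a local 16 kHz mono-channel WAV file named "sample.wav" (a placeholder).

```python
# Hedged sketch: transcription with NVIDIA NeMo (nemo_toolkit['asr'] assumed installed).
# "sample.wav" is a placeholder 16000 Hz mono-channel recording.
import nemo.collections.asr as nemo_asr

asr_model = nemo_asr.models.ASRModel.from_pretrained(model_name="nvidia/parakeet-rnnt-0.6b")
transcripts = asr_model.transcribe(["sample.wav"])
print(transcripts[0])
```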
diff --git a/model_data_json/onnx-community_Kokoro-82M-v1.0-ONNX.json b/model_data_json/onnx-community_Kokoro-82M-v1.0-ONNX.json
new file mode 100644
index 0000000000000000000000000000000000000000..7250e46afa039b8e3e8487b210169581a037c506
--- /dev/null
+++ b/model_data_json/onnx-community_Kokoro-82M-v1.0-ONNX.json
@@ -0,0 +1,17 @@
+{
+ "model_id": "onnx-community/Kokoro-82M-v1.0-ONNX",
+ "downloads": 82437,
+ "tags": [
+ "transformers.js",
+ "onnx",
+ "style_text_to_speech_2",
+ "text-to-speech",
+ "en",
+ "base_model:hexgrad/Kokoro-82M",
+ "base_model:quantized:hexgrad/Kokoro-82M",
+ "license:apache-2.0",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 library_name: transformers.js language: - en base_model: - hexgrad/Kokoro-82M pipeline_tag: text-to-speech --- # Kokoro TTS Kokoro is a frontier TTS model for its size of 82 million parameters (text in/audio out). ## Table of contents - Usage - JavaScript - Python - Voices/Samples - Quantizations ## Usage ### JavaScript First, install the library from NPM using: You can then generate speech as follows: ### Python Optionally, save the audio to a file: ## Voices/Samples > Life is like a box of chocolates. You never know what you're gonna get. | Name | Nationality | Gender | Sample | | ------------ | ----------- | ------ | --------------------------------------------------------------------------------------------------------------------------------------- | | **af_heart** | American | Female | Haoran Wei*, Chenglong Liu*, Jinyue Chen, Jia Wang, Lingyu Kong, Yanming Xu, Zheng Ge, Liang Zhao, Jianjian Sun, Yuang Peng, Chunrui Han, Xiangyu Zhang !image/jpeg ## Usage Inference using Huggingface transformers on NVIDIA GPUs. Requirements tested on python 3.10: More details about 'ocr_type', 'ocr_box', 'ocr_color', and 'render' can be found at our GitHub. Our training codes are available at our GitHub. ## More Multimodal Projects 👏 Welcome to explore more multimodal projects of our team: Vary | Fox | OneChart ## Citation If you find our work helpful, please consider citing our papers 📝 and liking this project ❤️!",
+ "model_explanation_gemini": "Performs multilingual image-to-text OCR (Optical Character Recognition) with end-to-end unified processing, supporting custom configurations like text type, bounding boxes, and colors.\n\nModel Features: \n- Multilingual image-text-to-text conversion \n- End-to-end unified OCR processing \n- Customizable OCR parameters (text type, bounding boxes, colors) \n- GPU-optimized inference \n- Open-source training code \n\nComparison: \nThis model positions itself as \"OCR-2.0\" with unified end"
+}
\ No newline at end of file
diff --git a/model_data_json/tasksource_ModernBERT-large-nli.json b/model_data_json/tasksource_ModernBERT-large-nli.json
new file mode 100644
index 0000000000000000000000000000000000000000..0c8106f97d7ce7bc165fadddab486ed2b164331e
--- /dev/null
+++ b/model_data_json/tasksource_ModernBERT-large-nli.json
@@ -0,0 +1,25 @@
+{
+ "model_id": "tasksource/ModernBERT-large-nli",
+ "downloads": 82059,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "modernbert",
+ "text-classification",
+ "instruct",
+ "natural-language-inference",
+ "nli",
+ "zero-shot-classification",
+ "en",
+ "dataset:nyu-mll/glue",
+ "dataset:facebook/anli",
+ "base_model:answerdotai/ModernBERT-large",
+ "base_model:finetune:answerdotai/ModernBERT-large",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- library_name: transformers base_model: - answerdotai/ModernBERT-large license: apache-2.0 language: - en pipeline_tag: zero-shot-classification datasets: - nyu-mll/glue - facebook/anli tags: - instruct - natural-language-inference - nli --- # Model Card for Model ID This model is ModernBERT multi-task fine-tuned on tasksource NLI tasks, including MNLI, ANLI, SICK, WANLI, doc-nli, LingNLI, FOLIO, FOL-NLI, LogicNLI, Label-NLI and all datasets in the below table). This is the equivalent of an \"instruct\" version. The model was trained for 200k steps on an Nvidia A30 GPU. It is very good at reasoning tasks (better than llama 3.1 8B Instruct on ANLI and FOLIO), long context reasoning, sentiment analysis and zero-shot classification with new labels. The following table shows model test accuracy. These are the scores for the same single transformer with different classification heads on top. Further gains can be obtained by fine-tuning on a single-task, e.g. SST, but it this checkpoint is great for zero-shot classification and natural language inference (contradiction/entailment/neutral classification). | test_name | test_accuracy | |:--------------------------------------|----------------:| | glue/mnli | 0.89 | | glue/qnli | 0.96 | | glue/rte | 0.91 | | glue/wnli | 0.64 | | glue/mrpc | 0.81 | | glue/qqp | 0.87 | | glue/cola | 0.87 | | glue/sst2 | 0.96 | | super_glue/boolq | 0.66 | | super_glue/cb | 0.86 | | super_glue/multirc | 0.9 | | super_glue/wic | 0.71 | | super_glue/axg | 1 | | anli/a1 | 0.72 | | anli/a2 | 0.54 | | anli/a3 | 0.55 | | sick/label | 0.91 | | sick/entailment_AB | 0.93 | | snli | 0.94 | | scitail/snli_format | 0.95 | | hans | 1 | | WANLI | 0.77 | | recast/recast_ner | 0.85 | | recast/recast_sentiment | 0.97 | | recast/recast_verbnet | 0.89 | | recast/recast_megaveridicality | 0.87 | | recast/recast_verbcorner | 0.87 | | recast/recast_kg_relations | 0.9 | | recast/recast_factuality | 0.95 | | recast/recast_puns | 0.98 | | probability_words_nli/reasoning_1hop | 1 | | probability_words_nli/usnli | 0.79 | | probability_words_nli/reasoning_2hop | 0.98 | | nan-nli | 0.85 | | nli_fever | 0.78 | | breaking_nli | 0.99 | | conj_nli | 0.72 | | fracas | 0.79 | | dialogue_nli | 0.94 | | mpe | 0.75 | | dnc | 0.91 | | recast_white/fnplus | 0.76 | | recast_white/sprl | 0.9 | | recast_white/dpr | 0.84 | | add_one_rte | 0.94 | | paws/labeled_final | 0.96 | | pragmeval/pdtb | 0.56 | | lex_glue/scotus | 0.58 | | lex_glue/ledgar | 0.85 | | dynasent/dynabench.dynasent.r1.all/r1 | 0.83 | | dynasent/dynabench.dynasent.r2.all/r2 | 0.76 | | cycic_classification | 0.96 | | lingnli | 0.91 | | monotonicity-entailment | 0.97 | | scinli | 0.88 | | naturallogic | 0.93 | | dynahate | 0.86 | | syntactic-augmentation-nli | 0.94 | | autotnli | 0.92 | | defeasible-nli/atomic | 0.83 | | defeasible-nli/snli | 0.8 | | help-nli | 0.96 | | nli-veridicality-transitivity | 0.99 | | lonli | 0.99 | | dadc-limit-nli | 0.79 | | folio | 0.71 | | tomi-nli | 0.54 | | puzzte | 0.59 | | temporal-nli | 0.93 | | counterfactually-augmented-snli | 0.81 | | cnli | 0.9 | | boolq-natural-perturbations | 0.72 | | equate | 0.65 | | logiqa-2.0-nli | 0.58 | | mindgames | 0.96 | | ConTRoL-nli | 0.66 | | logical-fallacy | 0.38 | | cladder | 0.89 | | conceptrules_v2 | 1 | | zero-shot-label-nli | 0.79 | | scone | 1 | | monli | 1 | | SpaceNLI | 1 | | propsegment/nli | 0.92 | | FLD.v2/default | 0.91 | | FLD.v2/star | 0.78 | | SDOH-NLI | 0.99 | | scifact_entailment | 0.87 | | feasibilityQA | 0.79 | | AdjectiveScaleProbe-nli 
| 1 | | resnli | 1 | | semantic_fragments_nli | 1 | | dataset_train_nli | 0.95 | | nlgraph | 0.97 | | ruletaker | 0.99 | | PARARULE-Plus | 1 | | logical-entailment | 0.93 | | nope | 0.56 | | LogicNLI | 0.91 | | contract-nli/contractnli_a/seg | 0.88 | | contract-nli/contractnli_b/full | 0.84 | | nli4ct_semeval2024 | 0.72 | | biosift-nli | 0.92 | | SIGA-nli | 0.57 | | FOL-nli | 0.79 | | doc-nli | 0.81 | | mctest-nli | 0.92 | | natural-language-satisfiability | 0.92 | | idioms-nli | 0.83 | | lifecycle-entailment | 0.79 | | MSciNLI | 0.84 | | hover-3way/nli | 0.92 | | seahorse_summarization_evaluation | 0.81 | | missing-item-prediction/contrastive | 0.88 | | Pol_NLI | 0.93 | | synthetic-retrieval-NLI/count | 0.72 | | synthetic-retrieval-NLI/position | 0.9 | | synthetic-retrieval-NLI/binary | 0.92 | | babi_nli | 0.98 | # Usage ## [ZS] Zero-shot classification pipeline NLI training data of this model includes label-nli, an NLI dataset specially constructed to improve this kind of zero-shot classification. ## [NLI] Natural language inference pipeline ## Backbone for further fine-tuning This checkpoint has stronger reasoning and fine-grained abilities than the base version and can be used for further fine-tuning. # Citation",
+ "model_explanation_gemini": "A multi-task fine-tuned ModernBERT model specializing in zero-shot classification, natural language inference (NLI), and reasoning tasks, outperforming models like Llama 3.1 8B Instruct on benchmarks like ANLI and FOLIO. \n\n**Features**: \n- Zero-shot classification with new labels \n- Natural language inference (contradiction/entailment/neutral classification) \n- Strong performance on reasoning tasks (long-context, logical, and sentiment analysis) \n- Multi"
+}
\ No newline at end of file
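The Usage section of the ModernBERT-large-nli card above names a zero-shot classification pipeline and an NLI pipeline, but both snippets were stripped. A small sketch with the standard transformers pipelines follows; the example texts and candidate labels are made up for illustration.

```python
# Sketch of the two usages named in the card, using standard transformers pipelines.
from transformers import pipeline

# [ZS] Zero-shot classification with new labels
zs = pipeline("zero-shot-classification", model="tasksource/ModernBERT-large-nli")
print(zs(
    "The delivery arrived two weeks late and the box was damaged.",
    candidate_labels=["logistics complaint", "product praise", "billing issue"],
))

# [NLI] entailment / neutral / contradiction classification on a premise-hypothesis pair
nli = pipeline("text-classification", model="tasksource/ModernBERT-large-nli")
print(nli({
    "text": "A soccer game with multiple males playing.",
    "text_pair": "Some men are playing a sport.",
}))
```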
diff --git a/model_data_json/tiiuae_falcon-7b.json b/model_data_json/tiiuae_falcon-7b.json
new file mode 100644
index 0000000000000000000000000000000000000000..e4cb52af8f8a7bd08ddcc03281f28468f402c6b3
--- /dev/null
+++ b/model_data_json/tiiuae_falcon-7b.json
@@ -0,0 +1,26 @@
+{
+ "model_id": "tiiuae/falcon-7b",
+ "downloads": 70655,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "safetensors",
+ "falcon",
+ "text-generation",
+ "custom_code",
+ "en",
+ "dataset:tiiuae/falcon-refinedweb",
+ "arxiv:2205.14135",
+ "arxiv:1911.02150",
+ "arxiv:2101.00027",
+ "arxiv:2005.14165",
+ "arxiv:2104.09864",
+ "arxiv:2306.01116",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "region:us"
+ ],
+ "description": "--- datasets: - tiiuae/falcon-refinedweb language: - en inference: false license: apache-2.0 new_version: tiiuae/falcon-11B --- # 🚀 Falcon-7B **Falcon-7B is a 7B parameters causal decoder-only model built by TII and trained on 1,500B tokens of RefinedWeb enhanced with curated corpora. It is made available under the Apache 2.0 license.** *Paper coming soon* 😊. 🤗 To get started with Falcon (inference, finetuning, quantization, etc.), we recommend reading this great blogpost fron HF! ## Why use Falcon-7B? * **It outperforms comparable open-source models** (e.g., MPT-7B, StableLM, RedPajama etc.), thanks to being trained on 1,500B tokens of RefinedWeb enhanced with curated corpora. See the OpenLLM Leaderboard. * **It features an architecture optimized for inference**, with FlashAttention (Dao et al., 2022) and multiquery (Shazeer et al., 2019). * **It is made available under a permissive Apache 2.0 license allowing for commercial use**, without any royalties or restrictions. ⚠️ **This is a raw, pretrained model, which should be further finetuned for most usecases.** If you are looking for a version better suited to taking generic instructions in a chat format, we recommend taking a look at Falcon-7B-Instruct. 🔥 **Looking for an even more powerful model?** Falcon-40B is Falcon-7B's big brother! 💥 **Falcon LLMs require PyTorch 2.0 for use with !** For fast inference with Falcon, check-out Text Generation Inference! Read more in this blogpost. You will need **at least 16GB of memory** to swiftly run inference with Falcon-7B. # Model Card for Falcon-7B ## Model Details ### Model Description - **Developed by:** - **Model type:** Causal decoder-only; - **Language(s) (NLP):** English, German, Spanish, French (and limited capabilities in Italian, Portuguese, Polish, Dutch, Romanian, Czech, Swedish); - **License:** Apache 2.0. ### Model Source - **Paper:** *coming soon*. ## Uses ### Direct Use Research on large language models; as a foundation for further specialization and finetuning for specific usecases (e.g., summarization, text generation, chatbot, etc.) ### Out-of-Scope Use Production use without adequate assessment of risks and mitigation; any use cases which may be considered irresponsible or harmful. ## Bias, Risks, and Limitations Falcon-7B is trained on English and French data only, and will not generalize appropriately to other languages. Furthermore, as it is trained on a large-scale corpora representative of the web, it will carry the stereotypes and biases commonly encountered online. ### Recommendations We recommend users of Falcon-7B to consider finetuning it for the specific set of tasks of interest, and for guardrails and appropriate precautions to be taken for any production use. ## How to Get Started with the Model ## Training Details ### Training Data Falcon-7B was trained on 1,500B tokens of RefinedWeb, a high-quality filtered and deduplicated web dataset which we enhanced with curated corpora. Significant components from our curated copora were inspired by The Pile (Gao et al., 2020). | **Data source** | **Fraction** | **Tokens** | **Sources** | |--------------------|--------------|------------|-----------------------------------| | RefinedWeb-English | 79% | 1,185B | massive web crawl | | Books | 7% | 110B | | | Conversations | 6% | 85B | Reddit, StackOverflow, HackerNews | | Code | 3% | 45B | | | RefinedWeb-French | 3% | 45B | massive web crawl | | Technical | 2% | 30B | arXiv, PubMed, USPTO, etc. | The data was tokenized with the Falcon-7B/40B tokenizer. 
### Training Procedure Falcon-7B was trained on 384 A100 40GB GPUs, using a 2D parallelism strategy (PP=2, DP=192) combined with ZeRO. #### Training Hyperparameters | **Hyperparameter** | **Value** | **Comment** | |--------------------|------------|-------------------------------------------| | Precision | | | | Optimizer | AdamW | | | Learning rate | 6e-4 | 4B tokens warm-up, cosine decay to 1.2e-5 | | Weight decay | 1e-1 | | | Z-loss | 1e-4 | | | Batch size | 2304 | 30B tokens ramp-up | #### Speeds, Sizes, Times Training happened in early March 2023 and took about two weeks. ## Evaluation *Paper coming soon*. See the OpenLLM Leaderboard for early results. ## Technical Specifications ### Model Architecture and Objective Falcon-7B is a causal decoder-only model trained on a causal language modeling task (i.e., predict the next token). The architecture is broadly adapted from the GPT-3 paper (Brown et al., 2020), with the following differences: * **Positional embeddings:** rotary (Su et al., 2021); * **Attention:** multiquery (Shazeer et al., 2019) and FlashAttention (Dao et al., 2022); * **Decoder-block:** parallel attention/MLP with a single layer norm. | **Hyperparameter** | **Value** | **Comment** | |--------------------|-----------|----------------------------------------| | Layers | 32 | | | | 4544 | Increased to compensate for multiquery | | | 64 | Reduced to optimise for FlashAttention | | Vocabulary | 65024 | | | Sequence length | 2048 | | ### Compute Infrastructure #### Hardware Falcon-7B was trained on AWS SageMaker, on 384 A100 40GB GPUs in P4d instances. #### Software Falcon-7B was trained on a custom distributed training codebase, Gigatron. It uses a 3D parallelism approach combined with ZeRO and high-performance Triton kernels (FlashAttention, etc.) ## Citation *Paper coming soon* 😊. In the meantime, you can use the following information to cite: To learn more about the pretraining dataset, see the 📓 RefinedWeb paper. ## License Falcon-7B is made available under the Apache 2.0 license. ## Contact falconllm@tii.ae",
+ "model_explanation_gemini": "Falcon-7B is a 7-billion-parameter, causal decoder-only model trained on 1,500B tokens of refined web and curated data, optimized for inference and outperforming similar open-source models like MPT-7B and StableLM under a permissive Apache 2.0 license. \n\n**Features:** \n- **Architecture:** Causal decoder-only with FlashAttention and multiquery optimization. \n- **Training Data:** 1,500B tokens from"
+}
\ No newline at end of file
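The "How to Get Started with the Model" section of the Falcon-7B card above was stripped of its snippet. A minimal text-generation sketch under the card's stated assumptions (bfloat16 weights, at least 16GB of memory, PyTorch 2.0) follows; the prompt is only an example.

```python
# Minimal sketch: text generation with Falcon-7B via the transformers pipeline.
# trust_remote_code may be unnecessary on recent transformers versions where Falcon is built in.
import torch
from transformers import AutoTokenizer, pipeline

model_id = "tiiuae/falcon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
generator = pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
out = generator(
    "Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth.",
    max_new_tokens=64,
    do_sample=True,
    top_k=10,
)
print(out[0]["generated_text"])
```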
diff --git a/model_data_json/tiiuae_falcon-rw-1b.json b/model_data_json/tiiuae_falcon-rw-1b.json
new file mode 100644
index 0000000000000000000000000000000000000000..fd9181788611bf0aab120373c28e37bbe08eb341
--- /dev/null
+++ b/model_data_json/tiiuae_falcon-rw-1b.json
@@ -0,0 +1,23 @@
+{
+ "model_id": "tiiuae/falcon-rw-1b",
+ "downloads": 72026,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "falcon",
+ "text-generation",
+ "custom_code",
+ "en",
+ "dataset:tiiuae/falcon-refinedweb",
+ "arxiv:2306.01116",
+ "arxiv:2005.14165",
+ "arxiv:2108.12409",
+ "arxiv:2205.14135",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "region:us"
+ ],
+ "description": "--- datasets: - tiiuae/falcon-refinedweb language: - en inference: false license: apache-2.0 --- # Falcon-RW-1B **Falcon-RW-1B is a 1B parameters causal decoder-only model built by TII and trained on 350B tokens of RefinedWeb. It is made available under the Apache 2.0 license.** See the 📓 paper on arXiv for more details. RefinedWeb is a high-quality web dataset built by leveraging stringent filtering and large-scale deduplication. Falcon-RW-1B, trained on RefinedWeb only, matches or outperforms comparable models trained on curated data. ⚠️ Falcon is now available as a core model in the library! To use the in-library version, please install the latest version of with , then simply remove the argument from . ⚠️ This model is intended for use as a **research artifact**, to study the influence of training on web data alone. **If you are interested in state-of-the-art models, we recommend using Falcon-7B/40B, both trained on >1,000 billion tokens.** 💥 **Falcon LLMs require PyTorch 2.0 for use with !** # Model Card for Falcon-RW-1B ## Model Details ### Model Description - **Developed by:** - **Model type:** Causal decoder-only; - **Language(s) (NLP):** English; - **License:** Apache 2.0. ### Model Source - **Paper:** ## Uses ### Direct Use Research on large language models, specifically the influence of adequately filtered and deduplicated web data on the properties of large language models (fairness, safety, limitations, capabilities, etc.). ### Out-of-Scope Use Production use without adequate assessment of risks and mitigation; any use cases which may be considered irresponsible or harmful. Broadly speaking, we would recommend Falcon-7B/40B for any use not directly related to research on web data pipelines. ## Bias, Risks, and Limitations Falcon-RW-1B is trained on English data only, and will not generalize appropriately to other languages. Furthermore, as it is trained on a large-scale corpora representative of the web, it will carry the stereotypes and biases commonly encountered online. ### Recommendations We recommend users of Falcon-RW-1B to consider finetuning it for the specific set of tasks of interest, and for guardrails and appropriate precautions to be taken for any production use. ## How to Get Started with the Model ## Training Details ### Training Data Falcon-RW-1B was trained on 350B tokens of RefinedWeb, a high-quality filtered and deduplicated web dataset. The data was tokenized with the GPT-2 tokenizer. ### Training Procedure Falcon-RW-1B was trained on 32 A100 40GB GPUs, using only data parallelism with ZeRO. #### Training Hyperparameters Hyperparameters were adapted from the GPT-3 paper (Brown et al., 2020). | **Hyperparameter** | **Value** | **Comment** | |--------------------|------------|-------------------------------------------| | Precision | | | | Optimizer | AdamW | | | Learning rate | 2e-4 | 500M tokens warm-up, cosine decay to 2e-5 | | Weight decay | 1e-1 | | | Batch size | 512 | 4B tokens ramp-up | #### Speeds, Sizes, Times Training happened in early December 2022 and took about six days. ## Evaluation See the 📓 paper on arXiv for in-depth evaluation. ## Technical Specifications ### Model Architecture and Objective Falcon-RW-1B is a causal decoder-only model trained on a causal language modeling task (i.e., predict the next token). The architecture is adapted from the GPT-3 paper (Brown et al., 2020), but uses ALiBi (Ofir et al., 2021) and FlashAttention (Dao et al., 2022). 
| **Hyperparameter** | **Value** | **Comment** | |--------------------|-----------|----------------------------------------| | Layers | 24 | | | | 2048 | | | | 64 | Reduced to optimise for FlashAttention | | Vocabulary | 50304 | | | Sequence length | 2048 | | ### Compute Infrastructure #### Hardware Falcon-RW-1B was trained on AWS SageMaker, on 32 A100 40GB GPUs in P4d instances. #### Software Falcon-RW-1B was trained on a custom distributed training codebase, Gigatron. It uses a 3D parallelism approach combined with ZeRO and high-performance Triton kernels (FlashAttention, etc.) ## Citation ## Contact falconllm@tii.ae",
+ "model_explanation_gemini": "\"Falcon-RW-1B is a 1-billion-parameter English causal decoder-only model trained on 350B tokens of RefinedWeb for researching the impact of filtered web data on large language models, outperforming comparable models trained on curated data.\"\n\n**Model Features:** \n- **Architecture:** Causal decoder-only (GPT-3 style) with ALiBi and FlashAttention \n- **Training Data:** 350B tokens from RefinedWeb (filtered, ded"
+}
\ No newline at end of file
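As with Falcon-7B above, the getting-started snippet for Falcon-RW-1B was stripped. This sketch calls `generate` directly instead of using the pipeline, under the same assumptions (the prompt is only an example).

```python
# Sketch: direct generate() call with Falcon-RW-1B.
# trust_remote_code may be unnecessary on recent transformers versions where Falcon is a core model.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "tiiuae/falcon-rw-1b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

inputs = tokenizer("The RefinedWeb dataset was built by", return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_k=10)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```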
diff --git a/model_data_json/timm_ViT-SO400M-14-SigLIP.json b/model_data_json/timm_ViT-SO400M-14-SigLIP.json
new file mode 100644
index 0000000000000000000000000000000000000000..eda4b33462c6e17c4e94601c7bfadb8b334fac8e
--- /dev/null
+++ b/model_data_json/timm_ViT-SO400M-14-SigLIP.json
@@ -0,0 +1,17 @@
+{
+ "model_id": "timm/ViT-SO400M-14-SigLIP",
+ "downloads": 72007,
+ "tags": [
+ "open_clip",
+ "safetensors",
+ "clip",
+ "siglip",
+ "zero-shot-image-classification",
+ "dataset:webli",
+ "arxiv:2303.15343",
+ "license:apache-2.0",
+ "region:us"
+ ],
+ "description": "--- tags: - clip - siglip library_name: open_clip pipeline_tag: zero-shot-image-classification license: apache-2.0 datasets: - webli --- # Model card for ViT-SO400M-14-SigLIP A SigLIP (Sigmoid loss for Language-Image Pre-training) model trained on WebLI. This model has been converted to PyTorch from the original JAX checkpoints in Big Vision. These weights are usable in both OpenCLIP (image + text) and timm (image only). ## Model Details - **Model Type:** Contrastive Image-Text, Zero-Shot Image Classification. - **Original:** - **Dataset:** WebLI - **Papers:** - Sigmoid loss for language image pre-training: ## Model Usage ### With OpenCLIP ### With (for image embeddings) ## Citation",
+ "model_explanation_gemini": "A SigLIP model trained on WebLI for zero-shot image classification and contrastive image-text tasks, usable in OpenCLIP (image + text) and timm (image only). \n\n**Features:** \n- Model type: Contrastive Image-Text, Zero-Shot Image Classification \n- Dataset: WebLI \n- Frameworks: OpenCLIP (image + text), timm (image only) \n- Original implementation: Converted from JAX (Big Vision) to PyTor"
+}
\ No newline at end of file
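The "With OpenCLIP" and timm usage snippets of the ViT-SO400M-14-SigLIP card above were stripped. A hedged zero-shot classification sketch using the OpenCLIP hf-hub loader follows; the image path and the candidate labels are placeholders, and the softmax ranking is a convenience rather than the exact SigLIP sigmoid scoring.

```python
# Hedged sketch: zero-shot image classification with OpenCLIP.
# "cat.jpg" and the label list are placeholders.
import torch
from PIL import Image
import open_clip

model, preprocess = open_clip.create_model_from_pretrained("hf-hub:timm/ViT-SO400M-14-SigLIP")
tokenizer = open_clip.get_tokenizer("hf-hub:timm/ViT-SO400M-14-SigLIP")

image = preprocess(Image.open("cat.jpg")).unsqueeze(0)
text = tokenizer(["a photo of a cat", "a photo of a dog", "a photo of a car"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    # SigLIP was trained with a sigmoid objective; a softmax over scaled similarities
    # is still a simple way to rank the candidate labels here.
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)
print(probs)
```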
diff --git a/model_data_json/timm_convnext_tiny.in12k.json b/model_data_json/timm_convnext_tiny.in12k.json
new file mode 100644
index 0000000000000000000000000000000000000000..c17e972cc602d5d41983c3083d90e3d9a963517b
--- /dev/null
+++ b/model_data_json/timm_convnext_tiny.in12k.json
@@ -0,0 +1,17 @@
+{
+ "model_id": "timm/convnext_tiny.in12k",
+ "downloads": 69867,
+ "tags": [
+ "timm",
+ "pytorch",
+ "safetensors",
+ "image-classification",
+ "transformers",
+ "dataset:imagenet-12k",
+ "arxiv:2201.03545",
+ "license:apache-2.0",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 library_name: timm tags: - image-classification - timm - transformers datasets: - imagenet-12k --- # Model card for convnext_tiny.in12k A ConvNeXt image classification model. Trained in on ImageNet-12k (a 11821 class subset of full ImageNet-22k) by Ross Wightman. ImageNet-12k training done on TPUs thanks to support of the TRC program. ## Model Details - **Model Type:** Image classification / feature backbone - **Model Stats:** - Params (M): 36.9 - GMACs: 4.5 - Activations (M): 13.4 - Image size: 224 x 224 - **Papers:** - A ConvNet for the 2020s: - **Original:** - **Dataset:** ImageNet-12k ## Model Usage ### Image Classification ### Feature Map Extraction ### Image Embeddings ## Model Comparison Explore the dataset and runtime metrics of this model in timm model results. All timing numbers from eager model PyTorch 1.13 on RTX 3090 w/ AMP. | model |top1 |top5 |img_size|param_count|gmacs |macts |samples_per_sec|batch_size| |------------------------------------------------------------------------------------------------------------------------------|------|------|--------|-----------|------|------|---------------|----------| | convnextv2_huge.fcmae_ft_in22k_in1k_512 |88.848|98.742|512 |660.29 |600.81|413.07|28.58 |48 | | convnextv2_huge.fcmae_ft_in22k_in1k_384 |88.668|98.738|384 |660.29 |337.96|232.35|50.56 |64 | | convnext_xxlarge.clip_laion2b_soup_ft_in1k |88.612|98.704|256 |846.47 |198.09|124.45|122.45 |256 | | convnext_large_mlp.clip_laion2b_soup_ft_in12k_in1k_384 |88.312|98.578|384 |200.13 |101.11|126.74|196.84 |256 | | convnextv2_large.fcmae_ft_in22k_in1k_384 |88.196|98.532|384 |197.96 |101.1 |126.74|128.94 |128 | | convnext_large_mlp.clip_laion2b_soup_ft_in12k_in1k_320 |87.968|98.47 |320 |200.13 |70.21 |88.02 |283.42 |256 | | convnext_xlarge.fb_in22k_ft_in1k_384 |87.75 |98.556|384 |350.2 |179.2 |168.99|124.85 |192 | | convnextv2_base.fcmae_ft_in22k_in1k_384 |87.646|98.422|384 |88.72 |45.21 |84.49 |209.51 |256 | | convnext_large.fb_in22k_ft_in1k_384 |87.476|98.382|384 |197.77 |101.1 |126.74|194.66 |256 | | convnext_large_mlp.clip_laion2b_augreg_ft_in1k |87.344|98.218|256 |200.13 |44.94 |56.33 |438.08 |256 | | convnextv2_large.fcmae_ft_in22k_in1k |87.26 |98.248|224 |197.96 |34.4 |43.13 |376.84 |256 | | convnext_base.clip_laion2b_augreg_ft_in12k_in1k_384 |87.138|98.212|384 |88.59 |45.21 |84.49 |365.47 |256 | | convnext_xlarge.fb_in22k_ft_in1k |87.002|98.208|224 |350.2 |60.98 |57.5 |368.01 |256 | | convnext_base.fb_in22k_ft_in1k_384 |86.796|98.264|384 |88.59 |45.21 |84.49 |366.54 |256 | | convnextv2_base.fcmae_ft_in22k_in1k |86.74 |98.022|224 |88.72 |15.38 |28.75 |624.23 |256 | | convnext_large.fb_in22k_ft_in1k |86.636|98.028|224 |197.77 |34.4 |43.13 |581.43 |256 | | convnext_base.clip_laiona_augreg_ft_in1k_384 |86.504|97.97 |384 |88.59 |45.21 |84.49 |368.14 |256 | | convnext_base.clip_laion2b_augreg_ft_in12k_in1k |86.344|97.97 |256 |88.59 |20.09 |37.55 |816.14 |256 | | convnextv2_huge.fcmae_ft_in1k |86.256|97.75 |224 |660.29 |115.0 |79.07 |154.72 |256 | | convnext_small.in12k_ft_in1k_384 |86.182|97.92 |384 |50.22 |25.58 |63.37 |516.19 |256 | | convnext_base.clip_laion2b_augreg_ft_in1k |86.154|97.68 |256 |88.59 |20.09 |37.55 |819.86 |256 | | convnext_base.fb_in22k_ft_in1k |85.822|97.866|224 |88.59 |15.38 |28.75 |1037.66 |256 | | convnext_small.fb_in22k_ft_in1k_384 |85.778|97.886|384 |50.22 |25.58 |63.37 |518.95 |256 | | convnextv2_large.fcmae_ft_in1k |85.742|97.584|224 |197.96 |34.4 |43.13 |375.23 |256 | | convnext_small.in12k_ft_in1k 
|85.174|97.506|224 |50.22 |8.71 |21.56 |1474.31 |256 | | convnext_tiny.in12k_ft_in1k_384 |85.118|97.608|384 |28.59 |13.14 |39.48 |856.76 |256 | | convnextv2_tiny.fcmae_ft_in22k_in1k_384 |85.112|97.63 |384 |28.64 |13.14 |39.48 |491.32 |256 | | convnextv2_base.fcmae_ft_in1k |84.874|97.09 |224 |88.72 |15.38 |28.75 |625.33 |256 | | convnext_small.fb_in22k_ft_in1k |84.562|97.394|224 |50.22 |8.71 |21.56 |1478.29 |256 | | convnext_large.fb_in1k |84.282|96.892|224 |197.77 |34.4 |43.13 |584.28 |256 | | convnext_tiny.in12k_ft_in1k |84.186|97.124|224 |28.59 |4.47 |13.44 |2433.7 |256 | | convnext_tiny.fb_in22k_ft_in1k_384 |84.084|97.14 |384 |28.59 |13.14 |39.48 |862.95 |256 | | convnextv2_tiny.fcmae_ft_in22k_in1k |83.894|96.964|224 |28.64 |4.47 |13.44 |1452.72 |256 | | convnext_base.fb_in1k |83.82 |96.746|224 |88.59 |15.38 |28.75 |1054.0 |256 | | convnextv2_nano.fcmae_ft_in22k_in1k_384 |83.37 |96.742|384 |15.62 |7.22 |24.61 |801.72 |256 | | convnext_small.fb_in1k |83.142|96.434|224 |50.22 |8.71 |21.56 |1464.0 |256 | | convnextv2_tiny.fcmae_ft_in1k |82.92 |96.284|224 |28.64 |4.47 |13.44 |1425.62 |256 | | convnext_tiny.fb_in22k_ft_in1k |82.898|96.616|224 |28.59 |4.47 |13.44 |2480.88 |256 | | convnext_nano.in12k_ft_in1k |82.282|96.344|224 |15.59 |2.46 |8.37 |3926.52 |256 | | convnext_tiny_hnf.a2h_in1k |82.216|95.852|224 |28.59 |4.47 |13.44 |2529.75 |256 | | convnext_tiny.fb_in1k |82.066|95.854|224 |28.59 |4.47 |13.44 |2346.26 |256 | | convnextv2_nano.fcmae_ft_in22k_in1k |82.03 |96.166|224 |15.62 |2.46 |8.37 |2300.18 |256 | | convnextv2_nano.fcmae_ft_in1k |81.83 |95.738|224 |15.62 |2.46 |8.37 |2321.48 |256 | | convnext_nano_ols.d1h_in1k |80.866|95.246|224 |15.65 |2.65 |9.38 |3523.85 |256 | | convnext_nano.d1h_in1k |80.768|95.334|224 |15.59 |2.46 |8.37 |3915.58 |256 | | convnextv2_pico.fcmae_ft_in1k |80.304|95.072|224 |9.07 |1.37 |6.1 |3274.57 |256 | | convnext_pico.d1_in1k |79.526|94.558|224 |9.05 |1.37 |6.1 |5686.88 |256 | | convnext_pico_ols.d1_in1k |79.522|94.692|224 |9.06 |1.43 |6.5 |5422.46 |256 | | convnextv2_femto.fcmae_ft_in1k |78.488|93.98 |224 |5.23 |0.79 |4.57 |4264.2 |256 | | convnext_femto_ols.d1_in1k |77.86 |93.83 |224 |5.23 |0.82 |4.87 |6910.6 |256 | | convnext_femto.d1_in1k |77.454|93.68 |224 |5.22 |0.79 |4.57 |7189.92 |256 | | convnextv2_atto.fcmae_ft_in1k |76.664|93.044|224 |3.71 |0.55 |3.81 |4728.91 |256 | | convnext_atto_ols.a2_in1k |75.88 |92.846|224 |3.7 |0.58 |4.11 |7963.16 |256 | | convnext_atto.d2_in1k |75.664|92.9 |224 |3.7 |0.55 |3.81 |8439.22 |256 | ## Citation",
+ "model_explanation_gemini": "A ConvNeXt-based image classification model trained on ImageNet-12k for categorizing images into 11,821 classes or extracting features/embeddings. \n\n**Features:** \n- **Model Type:** Image classification / feature backbone \n- **Params:** 36.9M \n- **Input Size:** 224x224 pixels \n- **Training Dataset:** ImageNet-12k (11,821 classes) \n- **Performance Metrics:** 85.118% top-1"
+}
\ No newline at end of file
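The timm usage sections of the convnext_tiny.in12k card above (image classification, feature map extraction, image embeddings) were stripped. A short classification sketch with the standard timm helpers follows; "dog.jpg" is a placeholder image path.

```python
# Sketch: ImageNet-12k classification with timm ("dog.jpg" is a placeholder image).
import timm
import torch
from PIL import Image

model = timm.create_model("convnext_tiny.in12k", pretrained=True).eval()
data_config = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**data_config, is_training=False)

img = Image.open("dog.jpg").convert("RGB")
with torch.no_grad():
    logits = model(transform(img).unsqueeze(0))
top5 = logits.softmax(dim=-1).topk(5)
print(top5.indices, top5.values)
```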
diff --git a/model_data_json/timm_tf_efficientnetv2_s.in21k_ft_in1k.json b/model_data_json/timm_tf_efficientnetv2_s.in21k_ft_in1k.json
new file mode 100644
index 0000000000000000000000000000000000000000..d6288088b4768c9bb70b0f3e163e2f8911df8896
--- /dev/null
+++ b/model_data_json/timm_tf_efficientnetv2_s.in21k_ft_in1k.json
@@ -0,0 +1,18 @@
+{
+ "model_id": "timm/tf_efficientnetv2_s.in21k_ft_in1k",
+ "downloads": 76011,
+ "tags": [
+ "timm",
+ "pytorch",
+ "safetensors",
+ "image-classification",
+ "transformers",
+ "dataset:imagenet-1k",
+ "dataset:imagenet-21k",
+ "arxiv:2104.00298",
+ "license:apache-2.0",
+ "region:us"
+ ],
+ "description": "--- tags: - image-classification - timm - transformers library_name: timm license: apache-2.0 datasets: - imagenet-1k - imagenet-21k --- # Model card for tf_efficientnetv2_s.in21k_ft_in1k A EfficientNet-v2 image classification model. Trained on ImageNet-21k and fine-tuned on ImageNet-1k in Tensorflow by paper authors, ported to PyTorch by Ross Wightman. ## Model Details - **Model Type:** Image classification / feature backbone - **Model Stats:** - Params (M): 21.5 - GMACs: 5.4 - Activations (M): 22.7 - Image size: train = 300 x 300, test = 384 x 384 - **Papers:** - EfficientNetV2: Smaller Models and Faster Training: - **Dataset:** ImageNet-1k - **Pretrain Dataset:** ImageNet-21k - **Original:** ## Model Usage ### Image Classification ### Feature Map Extraction ### Image Embeddings ## Model Comparison Explore the dataset and runtime metrics of this model in timm model results. ## Citation",
+ "model_explanation_gemini": "\"An EfficientNet-v2 image classification model pretrained on ImageNet-21k and fine-tuned on ImageNet-1k, capable of feature extraction and embeddings.\"\n\nModel Features: \n- **Type**: Image classification / feature backbone \n- **Params**: 21.5M \n- **Input size**: Train (300x300), Test (384x384) \n- **Pretrained on**: ImageNet-21k \n- **Fine-tuned on**: ImageNet-1"
+}
\ No newline at end of file
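The usage snippets of the tf_efficientnetv2_s card above were likewise stripped. This sketch shows the feature-map extraction path with timm's `features_only` mode; "scene.jpg" is a placeholder image path.

```python
# Sketch: multi-scale feature map extraction with timm's features_only mode.
import timm
import torch
from PIL import Image

model = timm.create_model(
    "tf_efficientnetv2_s.in21k_ft_in1k", pretrained=True, features_only=True
).eval()
data_config = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**data_config, is_training=False)

img = Image.open("scene.jpg").convert("RGB")   # placeholder image
with torch.no_grad():
    feature_maps = model(transform(img).unsqueeze(0))
for fmap in feature_maps:
    print(fmap.shape)   # one tensor per feature pyramid level
```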
diff --git a/model_data_json/timm_vit_small_patch16_224.dino.json b/model_data_json/timm_vit_small_patch16_224.dino.json
new file mode 100644
index 0000000000000000000000000000000000000000..6ffb8bf727974836028dad905b95b034f8320eb9
--- /dev/null
+++ b/model_data_json/timm_vit_small_patch16_224.dino.json
@@ -0,0 +1,17 @@
+{
+ "model_id": "timm/vit_small_patch16_224.dino",
+ "downloads": 72865,
+ "tags": [
+ "timm",
+ "pytorch",
+ "safetensors",
+ "image-feature-extraction",
+ "transformers",
+ "arxiv:2104.14294",
+ "arxiv:2010.11929",
+ "license:apache-2.0",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 library_name: timm tags: - image-feature-extraction - timm - transformers --- # Model card for vit_small_patch16_224.dino A Vision Transformer (ViT) image feature model. Trained with Self-Supervised DINO method. ## Model Details - **Model Type:** Image classification / feature backbone - **Model Stats:** - Params (M): 21.7 - GMACs: 4.3 - Activations (M): 8.2 - Image size: 224 x 224 - **Papers:** - Emerging Properties in Self-Supervised Vision Transformers: - An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale: - **Pretrain Dataset:** ImageNet-1k - **Original:** ## Model Usage ### Image Classification ### Image Embeddings ## Model Comparison Explore the dataset and runtime metrics of this model in timm model results. ## Citation",
+ "model_explanation_gemini": "A Vision Transformer (ViT) model trained with self-supervised DINO for image classification and feature extraction, using 224x224 resolution inputs.\n\nModel Features: \n- **Model Type**: Vision Transformer (ViT-small, patch16) \n- **Training Method**: Self-supervised DINO \n- **Primary Tasks**: Image classification, feature embedding extraction \n- **Input Resolution**: 224x224 pixels \n- **Model Size**: 21.7M parameters, 4.3"
+}
\ No newline at end of file
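The image-embeddings snippet of the DINO ViT-S/16 card above was stripped. This sketch pulls a pooled image embedding by creating the model with `num_classes=0`; "query.jpg" is a placeholder image path.

```python
# Sketch: image embeddings from the DINO ViT-S/16 backbone (num_classes=0 removes the head).
import timm
import torch
from PIL import Image

model = timm.create_model("vit_small_patch16_224.dino", pretrained=True, num_classes=0).eval()
data_config = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**data_config, is_training=False)

img = Image.open("query.jpg").convert("RGB")   # placeholder image
with torch.no_grad():
    embedding = model(transform(img).unsqueeze(0))   # expected shape: (1, 384)
print(embedding.shape)
```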
diff --git a/model_data_json/togethercomputer_evo-1-8k-base.json b/model_data_json/togethercomputer_evo-1-8k-base.json
new file mode 100644
index 0000000000000000000000000000000000000000..fbc620a561907fe7c2164cbab4ead6498bd1d32a
--- /dev/null
+++ b/model_data_json/togethercomputer_evo-1-8k-base.json
@@ -0,0 +1,28 @@
+{
+ "model_id": "togethercomputer/evo-1-8k-base",
+ "downloads": 73720,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "stripedhyena",
+ "text-generation",
+ "long context",
+ "deep signal processing",
+ "hybrid",
+ "biology",
+ "genomics",
+ "custom_code",
+ "arxiv:2302.10866",
+ "arxiv:2203.14343",
+ "arxiv:2310.18780",
+ "arxiv:2206.11893",
+ "arxiv:2303.06349",
+ "arxiv:2102.02611",
+ "arxiv:2210.09298",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "region:us"
+ ],
+ "description": "--- license: apache-2.0 tags: - stripedhyena - long context - deep signal processing - hybrid - biology - genomics --- ## Evo-1 (Phase 1)
### News We identified and fixed an issue related to a wrong permutation of some projections, which affects generation quality. To use the new model revision, please load as follows: ### About Evo is a biological foundation model capable of long-context modeling and design. Evo uses the StripedHyena architecture to enable modeling of sequences at a single-nucleotide, byte-level resolution with near-linear scaling of compute and memory relative to context length. Evo has 7 billion parameters and is trained on OpenGenome, a prokaryotic whole-genome dataset containing ~300 billion tokens. Technical details about Evo can be found in our preprint and our accompanying blog posts. Evo was collaboratively developed by the Arc Institute and TogetherAI. As part of our commitment to open science, we release **weights of 15 intermediate pretraining checkpoints** for phase 1 and phase 2 of pretraining. The checkpoints are available as branches of the corresponding HuggingFace repository. **Evo-1 (Phase 1)** is our first model in the Evo family, trained at a context length of 8k. | Checkpoint Name | Description | |----------------------------------------|-------------| | | A model pretrained with 8,192 context. We use this model as the base model for molecular-scale finetuning tasks. | | | A model pretrained with 131,072 context using as the initialization. We use this model to reason about and generate sequences at the genome scale. | ### Model Architecture StripedHyena is a deep signal processing, hybrid architecture composed of multi-head attention and gated convolutions arranged in Hyena blocks, improving over decoder-only Transformers. StripedHyena is designed to leverage the specialization of each of its layer classes, with Hyena layers implementing the bulk of the computation required for sequence processing and attention layers supplementing the ability to perform targeted pattern recall. Some highlights of the architecture: - **Efficient autoregressive generation** via a recurrent mode (>500k generation with a single 80GB GPU) - **Significantly faster training and finetuning** at long context (>3x at 131k) - **Improved scaling laws over state-of-the-art architectures** (e.g., Transformer++) on both natural language and biological sequences. - **Robust to training beyond the compute-optimal frontier** e.g., training way beyond Chinchilla-optimal token amounts (see preprint for details -- more details to come) ### How to use Evo Example usage is provided in the standalone repo. #### Parametrization for Inference and Finetuning One of the advantages of deep signal processing models is their flexibility. Different parametrizations of convolutions can be used depending on the memory, expressivity and causality requirements of pretraining, finetuning or inference workloads. The main classes are: - Modal canonical: unconstrained poles (reference, reference), or constrained poles (reference, reference). - Companion canonical / rational: TBA. - Hypernetworks: hypernetwork (reference), modulated hypernetwork (reference). - Explicit: modulated explicit (reference). StripedHyena is a mixed precision model. Make sure to keep your and in precision, especially for longer prompts or training. ### Disclaimer To use StripedHyena outside of the playground, you will need to install custom kernels. Please follow the instructions from the standalone repository. ## Cite",
+ "model_explanation_gemini": "\"Evo-1-8k-base is a 7B-parameter biological foundation model for long-context genomic sequence modeling and design, using the StripedHyena hybrid architecture for efficient byte-level processing.\"\n\n### Model Features: \n- **Architecture**: StripedHyena (Hyena blocks + multi-head attention) \n- **Purpose**: Single-nucleotide, byte-level genome sequence modeling \n- **Context Length**: 8k tokens (base version) \n- **Training Data**:"
+}
\ No newline at end of file
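The Evo-1 card above says to load the fixed model revision "as follows" but the snippet was stripped. A hedged loading sketch with transformers follows; the revision string is an assumption (check the repository branches for the fixed-projection checkpoint), and full generation additionally requires the custom kernels from the standalone repository.

```python
# Hedged sketch: loading Evo-1 (StripedHyena) with transformers.
# The revision below is an assumed branch name for the fixed-projection checkpoint;
# trust_remote_code is required because StripedHyena ships as custom code.
import torch
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "togethercomputer/evo-1-8k-base"
revision = "1.1_fix"   # assumption; verify against the repo's branches

config = AutoConfig.from_pretrained(model_id, trust_remote_code=True, revision=revision)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    trust_remote_code=True,
    revision=revision,
    torch_dtype=torch.bfloat16,
)
```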
diff --git a/model_data_json/typeform_distilbert-base-uncased-mnli.json b/model_data_json/typeform_distilbert-base-uncased-mnli.json
new file mode 100644
index 0000000000000000000000000000000000000000..5c314758fed8d1dc6b5f0e75c4ed4bda43796eb7
--- /dev/null
+++ b/model_data_json/typeform_distilbert-base-uncased-mnli.json
@@ -0,0 +1,22 @@
+{
+ "model_id": "typeform/distilbert-base-uncased-mnli",
+ "downloads": 70769,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "tf",
+ "safetensors",
+ "distilbert",
+ "text-classification",
+ "zero-shot-classification",
+ "en",
+ "dataset:multi_nli",
+ "arxiv:1910.09700",
+ "arxiv:2105.09680",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- language: en pipeline_tag: zero-shot-classification tags: - distilbert datasets: - multi_nli metrics: - accuracy --- # DistilBERT base model (uncased) ## Table of Contents - Model Details - How to Get Started With the Model - Uses - Risks, Limitations and Biases - Training - Evaluation - Environmental Impact ## Model Details **Model Description:** This is the uncased DistilBERT model fine-tuned on Multi-Genre Natural Language Inference (MNLI) dataset for the zero-shot classification task. - **Developed by:** The Typeform team. - **Model Type:** Zero-Shot Classification - **Language(s):** English - **License:** Unknown - **Parent Model:** See the distilbert base uncased model for more information about the Distilled-BERT base model. ## How to Get Started with the Model ## Uses This model can be used for text classification tasks. ## Risks, Limitations and Biases **CONTENT WARNING: Readers should be aware this section contains content that is disturbing, offensive, and can propagate historical and current stereotypes.** Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). ## Training #### Training Data This model of DistilBERT-uncased is pretrained on the Multi-Genre Natural Language Inference (MultiNLI) corpus. It is a crowd-sourced collection of 433k sentence pairs annotated with textual entailment information. The corpus covers a range of genres of spoken and written text, and supports a distinctive cross-genre generalization evaluation. This model is also **not** case-sensitive, i.e., it does not make a difference between \"english\" and \"English\". #### Training Procedure Training is done on a p3.2xlarge AWS EC2 with the following hyperparameters: ## Evaluation #### Evaluation Results When fine-tuned on downstream tasks, this model achieves the following results: - **Epoch = ** 5.0 - **Evaluation Accuracy =** 0.8206875508543532 - **Evaluation Loss =** 0.8706700205802917 - ** Evaluation Runtime = ** 17.8278 - ** Evaluation Samples per second = ** 551.498 MNLI and MNLI-mm results: | Task | MNLI | MNLI-mm | |:----:|:----:|:----:| | | 82.0 | 82.0 | ## Environmental Impact Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). We present the hardware type based on the associated paper. **Hardware Type:** 1 NVIDIA Tesla V100 GPUs **Hours used:** Unknown **Cloud Provider:** AWS EC2 P3 **Compute Region:** Unknown **Carbon Emitted:** (Power consumption x Time x Carbon produced based on location of power grid): Unknown",
+ "model_explanation_gemini": "A zero-shot text classification model based on DistilBERT, fine-tuned on the MNLI dataset for English tasks without requiring task-specific training data.\n\n**Features:** \n- Zero-shot classification \n- English language \n- Uncased (case-insensitive) \n- Fine-tuned on MultiNLI (433k sentence pairs) \n- DistilBERT base architecture (efficient, distilled version of BERT) \n\n**Comparison:** \nUnlike standard classification models requiring labeled training data for each task,"
+}
\ No newline at end of file
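The distilbert-base-uncased-mnli card above points to a "How to Get Started With the Model" section, but the scraped description does not retain its code. As a minimal sketch of typical usage (the input text and candidate labels below are illustrative, not from the card), assuming the transformers zero-shot-classification pipeline:

```python
from transformers import pipeline

# Zero-shot classification: the MNLI-finetuned model scores each candidate
# label as an entailment hypothesis against the input text.
classifier = pipeline(
    "zero-shot-classification",
    model="typeform/distilbert-base-uncased-mnli",
)

result = classifier(
    "The new update drains my battery twice as fast.",      # illustrative input
    candidate_labels=["battery", "performance", "design"],  # hypothetical labels
)
print(result["labels"][0], result["scores"][0])
```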
diff --git a/model_data_json/unsloth_DeepSeek-V3-0324-GGUF.json b/model_data_json/unsloth_DeepSeek-V3-0324-GGUF.json
new file mode 100644
index 0000000000000000000000000000000000000000..07f95853a1856fbc0eb5a96c8841f81e1d656f22
--- /dev/null
+++ b/model_data_json/unsloth_DeepSeek-V3-0324-GGUF.json
@@ -0,0 +1,25 @@
+{
+ "model_id": "unsloth/DeepSeek-V3-0324-GGUF",
+ "downloads": 78841,
+ "tags": [
+ "transformers",
+ "gguf",
+ "deepseek_v3",
+ "text-generation",
+ "deepseek",
+ "unsloth",
+ "custom_code",
+ "en",
+ "arxiv:2412.19437",
+ "base_model:deepseek-ai/DeepSeek-V3-0324",
+ "base_model:quantized:deepseek-ai/DeepSeek-V3-0324",
+ "license:mit",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "fp8",
+ "region:us",
+ "conversational"
+ ],
+ "description": "--- base_model: deepseek-ai/DeepSeek-V3-0324 language: - en library_name: transformers license: mit tags: - deepseek_v3 - deepseek - unsloth - transformers ---
Our DeepSeek-V3-0324 GGUFs allow you to run the model in llama.cpp, LMStudio, Open WebUI and other inference frameworks. Includes 1-4-bit Dynamic versions, which yields better accuracy and results than standard quantization. | MoE Bits | Type | Disk Size | Accuracy | Link | Details | |----------|----------|-------------|----------|------------------------------------------------------------------------------------------------------------|---------------------------------------------------| | 1.78bit (prelim) | IQ1_S | **186GB** | Ok | Link | in MoE mixture of 2.06/1.78bit | | 1.93bit (prelim) | IQ1_M | **196GB** | Fair | Link | in MoE mixture of 2.06/1.93bit | | 2.42bit | IQ2_XXS | **219GB** | Recommended | Link | in MoE all 2.42bit | | 2.71bit | Q2_K_XL | **248GB** | Recommended | Link | in MoE mixture of 3.5/2.71bit | | 3.5bit | Q3_K_XL | **321GB** | Great | Link | in MoE mixture of 4.5/3.5bit | | 4.5bit | Q4_K_XL | **405GB** | Best | Link | in MoE mixture of 5.5/4.5bit | Prelim = preliminary - through our testing, they're generally fine but sometimes don't produce the best code and so more work/testing needs to be done. 2.71bit was found to be the best in terms of performance/size and produces code that is great and works well. 2.42bit was also found to pass all our tests. So, for best results, use the 2.42-bit (IQ2_XXS) or 2.71-bit (Q2_K_XL) versions. Though not a must, try to have at least 180GB+ combined VRAM + RAM. Thank you to the DeepSeek team for releasing their March update to the DeepSeek V3 models. Also thank you to bartowski for providing imatric V3 quants. # Finetune your own Reasoning model like R1 with Unsloth! We have a free Google Colab notebook for turning Llama 3.1 (8B) into a reasoning model: ## ✨ Finetune for Free All notebooks are **beginner friendly**! Add your dataset, click \"Run All\", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **GRPO with Phi-4 (14B)** | ▶️ Start on Colab-GRPO.ipynb) | 2x faster | 80% less | | **Llama-3.2 (3B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.4x faster | 58% less | | **Llama-3.2 (11B vision)** | ▶️ Start on Colab-Vision.ipynb) | 2x faster | 60% less | | **Qwen2 VL (7B)** | ▶️ Start on Colab-Vision.ipynb) | 1.8x faster | 60% less | | **Qwen2.5 (7B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2x faster | 60% less | | **Llama-3.1 (8B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Phi-3.5 (mini)** | ▶️ Start on Colab | 2x faster | 50% less | | **Gemma 2 (9B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Mistral (7B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.2x faster | 62% less |
## Features DeepSeek-V3-0324 demonstrates notable improvements over its predecessor, DeepSeek-V3, in several key aspects. !Model Performance ### Reasoning Capabilities - Significant improvements in benchmark performance: - MMLU-Pro: 75.9 → 81.2 (+5.3) - GPQA: 59.1 → 68.4 (+9.3) - AIME: 39.6 → 59.4 (+19.8) - LiveCodeBench: 39.2 → 49.2 (+10.0) ### Front-End Web Development - Improved the executability of the code - More aesthetically pleasing web pages and game front-ends ### Chinese Writing Proficiency - Enhanced style and content quality: - Aligned with the R1 writing style - Better quality in medium-to-long-form writing - Feature Enhancements - Improved multi-turn interactive rewriting - Optimized translation quality and letter writing ### Chinese Search Capabilities - Enhanced report analysis requests with more detailed outputs ### Function Calling Improvements - Increased accuracy in Function Calling, fixing issues from previous V3 versions --- ## Usage Recommendations ### System Prompt In the official DeepSeek web/app, we use the same system prompt with a specific date. For example, ### Temperature In our web and application environments, the temperature parameter $T_{model}$ is set to 0.3. Because many users use the default temperature 1.0 in API call, we have implemented an API temperature $T_{api}$ mapping mechanism that adjusts the input API temperature value of 1.0 to the most suitable model temperature setting of 0.3. $$ T_{model} = T_{api} \\times 0.3 \\quad (0 \\leq T_{api} \\leq 1) $$ $$ T_{model} = T_{api} - 0.7 \\quad (1 < T_{api} \\leq 2) $$ Thus, if you call V3 via API, temperature 1.0 equals to the model temperature 0.3. ### Prompts for File Uploading and Web Search For file uploading, please follow the template to create prompts, where {file_name}, {file_content} and {question} are arguments. For Web Search, {search_results}, {cur_date}, and {question} are arguments. For Chinese query, we use the prompt: For English query, we use the prompt: ## How to Run Locally The model structure of DeepSeek-V3-0324 is exactly the same as DeepSeek-V3. Please visit DeepSeek-V3 repo for more information about running this model locally. **This model supports features such as function calling, JSON output, and FIM completion. For instructions on how to construct prompts to use these features, please refer to DeepSeek-V2.5 repo.** **NOTE: Hugging Face's Transformers has not been directly supported yet.** ## License This repository and the model weights are licensed under the MIT License. ## Citation ## Contact If you have any questions, please raise an issue or contact us at service@deepseek.com.",
+ "model_explanation_gemini": "A quantized version of DeepSeek-V3-0324 optimized for efficient local inference with improved accuracy through selective quantization, compatible with llama.cpp and other frameworks. \n\n**Features:** \n- Multiple quantization levels (1-4 bit) with dynamic versions for better accuracy \n- Optimized for local inference (llama.cpp, LMStudio, Open WebUI) \n- Recommended variants: 2.42-bit (IQ2_XXS) and 2.71-bit (Q2_K"
+}
\ No newline at end of file
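The DeepSeek-V3-0324 card above defines a piecewise mapping from the API temperature to the model temperature. A small sketch of that mapping follows; the function name is ours, only the two formulas come from the card:

```python
def api_to_model_temperature(t_api: float) -> float:
    """Map an API temperature in [0, 2] to the model temperature per the card:
    T_model = T_api * 0.3  for 0 <= T_api <= 1
    T_model = T_api - 0.7  for 1 <  T_api <= 2
    """
    if not 0.0 <= t_api <= 2.0:
        raise ValueError("T_api must be in [0, 2]")
    return t_api * 0.3 if t_api <= 1.0 else t_api - 0.7

# The API default of 1.0 therefore maps to the recommended model temperature 0.3.
print(api_to_model_temperature(1.0))  # 0.3
```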
diff --git a/model_data_json/unsloth_Llama-3.2-1B-Instruct-unsloth-bnb-4bit.json b/model_data_json/unsloth_Llama-3.2-1B-Instruct-unsloth-bnb-4bit.json
new file mode 100644
index 0000000000000000000000000000000000000000..d609fee5b90011a9da7928d85a1defde5804ea8e
--- /dev/null
+++ b/model_data_json/unsloth_Llama-3.2-1B-Instruct-unsloth-bnb-4bit.json
@@ -0,0 +1,27 @@
+{
+ "model_id": "unsloth/Llama-3.2-1B-Instruct-unsloth-bnb-4bit",
+ "downloads": 68982,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "llama",
+ "text-generation",
+ "llama-3",
+ "meta",
+ "facebook",
+ "unsloth",
+ "conversational",
+ "en",
+ "base_model:meta-llama/Llama-3.2-1B-Instruct",
+ "base_model:quantized:meta-llama/Llama-3.2-1B-Instruct",
+ "license:llama3.2",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "4-bit",
+ "bitsandbytes",
+ "region:us"
+ ],
+ "description": "--- base_model: meta-llama/Llama-3.2-1B-Instruct language: - en library_name: transformers license: llama3.2 tags: - llama-3 - llama - meta - facebook - unsloth - transformers --- ## ***See our collection for all versions of Llama 3.2 including GGUF, 4-bit and original 16-bit formats.*** *Dynamic 4-bit: Unsloth's Dynamic 4-bit Quants selectively avoids quantizing certain parameters, greatly increase accuracy than standard 4-bit. See our full collection of Unsloth quants on Hugging Face here.* # Finetune Llama 3.2, Gemma 2, Mistral 2-5x faster with 70% less memory via Unsloth! We have a free Google Colab Tesla T4 notebook for Llama 3.2 (3B) here: unsloth/Llama-3.2-1B-Instruct-unsloth-bnb-4bit For more details on the model, please go to Meta's original model card ## ✨ Finetune for Free All notebooks are **beginner friendly**! Add your dataset, click \"Run All\", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **Llama-3.2 (3B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.4x faster | 58% less | | **Llama-3.2 (11B vision)** | ▶️ Start on Colab-Vision.ipynb) | 2x faster | 60% less | | **Qwen2 VL (7B)** | ▶️ Start on Colab-Vision.ipynb) | 1.8x faster | 60% less | | **Qwen2.5 (7B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2x faster | 60% less | | **Llama-3.1 (8B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Phi-3.5 (mini)** | ▶️ Start on Colab | 2x faster | 50% less | | **Gemma 2 (9B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Mistral (7B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.2x faster | 62% less | - This Llama 3.2 conversational notebook-Conversational.ipynb) is useful for ShareGPT ChatML / Vicuna templates. - This text completion notebook-Text_Completion.ipynb) is for raw text. This DPO notebook replicates Zephyr. - \\* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster. ## Special Thanks A huge thank you to the Meta and Llama team for creating and releasing these models. ## Model Information The Meta Llama 3.2 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 1B and 3B sizes (text in/text out). The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. They outperform many of the available open source and closed chat models on common industry benchmarks. **Model developer**: Meta **Model Architecture:** Llama 3.2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. **Supported languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai are officially supported. Llama 3.2 has been trained on a broader collection of languages than these 8 supported languages. Developers may fine-tune Llama 3.2 models for languages beyond these supported languages, provided they comply with the Llama 3.2 Community License and the Acceptable Use Policy. 
Developers are always expected to ensure that their deployments, including those that involve additional languages, are completed safely and responsibly. **Llama 3.2 family of models** Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date:** Sept 25, 2024 **Status:** This is a static model trained on an offline dataset. Future versions may be released that improve model capabilities and safety. **License:** Use of Llama 3.2 is governed by the Llama 3.2 Community License (a custom, commercial license agreement). **Where to send questions or comments about the model:** Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go here.",
+ "model_explanation_gemini": "A 4-bit quantized version of Meta's Llama-3.2-1B-Instruct model optimized for faster inference and lower memory usage while maintaining higher accuracy than standard 4-bit quantization, designed for multilingual dialogue tasks like retrieval and summarization. \n\n**Features:** \n- Dynamic 4-bit quantization (selective parameter quantization for improved accuracy) \n- Optimized for multilingual conversational use (English, German, French, etc.) \n- 70% less memory usage and 2"
+}
\ No newline at end of file
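Since the Llama-3.2-1B repo above ships weights already quantized with bitsandbytes, a minimal inference sketch with plain transformers could look like the following (requires bitsandbytes installed; the prompt and generation settings are illustrative, not from the card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/Llama-3.2-1B-Instruct-unsloth-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# The checkpoint is pre-quantized (bitsandbytes 4-bit), so no extra
# quantization config is needed; device_map places layers on the GPU.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize what dynamic 4-bit quantization does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```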
diff --git a/model_data_json/unsloth_Llama-3.2-3B-Instruct-bnb-4bit.json b/model_data_json/unsloth_Llama-3.2-3B-Instruct-bnb-4bit.json
new file mode 100644
index 0000000000000000000000000000000000000000..6483145c731dc979074f7fdf4ebf974df53c7af7
--- /dev/null
+++ b/model_data_json/unsloth_Llama-3.2-3B-Instruct-bnb-4bit.json
@@ -0,0 +1,27 @@
+{
+ "model_id": "unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
+ "downloads": 69409,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "llama",
+ "text-generation",
+ "llama-3",
+ "meta",
+ "facebook",
+ "unsloth",
+ "conversational",
+ "en",
+ "base_model:meta-llama/Llama-3.2-3B-Instruct",
+ "base_model:quantized:meta-llama/Llama-3.2-3B-Instruct",
+ "license:llama3.2",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "4-bit",
+ "bitsandbytes",
+ "region:us"
+ ],
+ "description": "--- base_model: meta-llama/Llama-3.2-3B-Instruct language: - en library_name: transformers license: llama3.2 tags: - llama-3 - llama - meta - facebook - unsloth - transformers --- ## ***See our collection for all versions of Llama 3.2 including GGUF, 4-bit and original 16-bit formats.*** # Finetune Llama 3.2, Gemma 2, Mistral 2-5x faster with 70% less memory via Unsloth! We have a free Google Colab Tesla T4 notebook for Llama 3.2 (3B) here: # unsloth/Llama-3.2-3B-Instruct-bnb-4bit For more details on the model, please go to Meta's original model card ## ✨ Finetune for Free All notebooks are **beginner friendly**! Add your dataset, click \"Run All\", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **Llama-3.2 (3B)** | ▶️ Start on Colab | 2.4x faster | 58% less | | **Llama-3.1 (11B vision)** | ▶️ Start on Colab | 2.4x faster | 58% less | | **Llama-3.1 (8B)** | ▶️ Start on Colab | 2.4x faster | 58% less | | **Phi-3.5 (mini)** | ▶️ Start on Colab | 2x faster | 50% less | | **Gemma 2 (9B)** | ▶️ Start on Colab | 2.4x faster | 58% less | | **Mistral (7B)** | ▶️ Start on Colab | 2.2x faster | 62% less | | **DPO - Zephyr** | ▶️ Start on Colab | 1.9x faster | 19% less | - This conversational notebook is useful for ShareGPT ChatML / Vicuna templates. - This text completion notebook is for raw text. This DPO notebook replicates Zephyr. - \\* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster. ## Special Thanks A huge thank you to the Meta and Llama team for creating and releasing these models. ## Model Information The Meta Llama 3.2 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 1B and 3B sizes (text in/text out). The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. They outperform many of the available open source and closed chat models on common industry benchmarks. **Model developer**: Meta **Model Architecture:** Llama 3.2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. **Supported languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai are officially supported. Llama 3.2 has been trained on a broader collection of languages than these 8 supported languages. Developers may fine-tune Llama 3.2 models for languages beyond these supported languages, provided they comply with the Llama 3.2 Community License and the Acceptable Use Policy. Developers are always expected to ensure that their deployments, including those that involve additional languages, are completed safely and responsibly. **Llama 3.2 family of models** Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date:** Sept 25, 2024 **Status:** This is a static model trained on an offline dataset. Future versions may be released that improve model capabilities and safety. 
**License:** Use of Llama 3.2 is governed by the Llama 3.2 Community License (a custom, commercial license agreement). **Where to send questions or comments about the model:** Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go here.",
+ "model_explanation_gemini": "A 4-bit quantized version of Meta's Llama-3.2-3B-Instruct model optimized for efficient multilingual dialogue tasks, offering faster finetuning and lower memory usage via Unsloth.\n\n**Features:** \n- Base model: Meta's Llama-3.2-3B-Instruct \n- 4-bit quantization (bnb-4bit) \n- Optimized for multilingual dialogue (English, German, French, etc.) \n- Faster finetuning (2"
+}
\ No newline at end of file
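The Llama-3.2-3B card above centres on loading and fine-tuning these 4-bit checkpoints through Unsloth. A minimal inference sketch with Unsloth's FastLanguageModel (the sequence length, prompt, and generation settings are our assumptions, not values from the card):

```python
from unsloth import FastLanguageModel

# Load the pre-quantized 4-bit checkpoint; load_in_4bit keeps memory use low.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
    max_seq_length=2048,   # illustrative
    load_in_4bit=True,
)

# Switch to Unsloth's fast inference path before generating.
FastLanguageModel.for_inference(model)
inputs = tokenizer("List three capital cities in Europe:", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```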
diff --git a/model_data_json/unsloth_Meta-Llama-3.1-8B-bnb-4bit.json b/model_data_json/unsloth_Meta-Llama-3.1-8B-bnb-4bit.json
new file mode 100644
index 0000000000000000000000000000000000000000..26bc4b2eb1f4480b17d2b9d0e553532f1322838a
--- /dev/null
+++ b/model_data_json/unsloth_Meta-Llama-3.1-8B-bnb-4bit.json
@@ -0,0 +1,27 @@
+{
+ "model_id": "unsloth/Meta-Llama-3.1-8B-bnb-4bit",
+ "downloads": 80572,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "llama",
+ "text-generation",
+ "llama-3",
+ "meta",
+ "facebook",
+ "unsloth",
+ "en",
+ "arxiv:2204.05149",
+ "base_model:meta-llama/Llama-3.1-8B",
+ "base_model:quantized:meta-llama/Llama-3.1-8B",
+ "license:llama3.1",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "4-bit",
+ "bitsandbytes",
+ "region:us"
+ ],
+ "description": "--- base_model: meta-llama/Meta-Llama-3.1-8B language: - en library_name: transformers license: llama3.1 tags: - llama-3 - llama - meta - facebook - unsloth - transformers --- # Finetune Llama 3.2, Gemma 2, Mistral 2-5x faster with 70% less memory via Unsloth! We have a free Google Colab Tesla T4 notebook for Llama 3.1 (8B) here: ## ✨ Finetune for Free All notebooks are **beginner friendly**! Add your dataset, click \"Run All\", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **Llama-3.2 (3B)** | ▶️ Start on Colab | 2.4x faster | 58% less | | **Llama-3.2 (11B vision)** | ▶️ Start on Colab | 2x faster | 60% less | | **Llama-3.1 (8B)** | ▶️ Start on Colab | 2.4x faster | 58% less | | **Qwen2 VL (7B)** | ▶️ Start on Colab | 1.8x faster | 60% less | | **Qwen2.5 (7B)** | ▶️ Start on Colab | 2x faster | 60% less | | **Phi-3.5 (mini)** | ▶️ Start on Colab | 2x faster | 50% less | | **Gemma 2 (9B)** | ▶️ Start on Colab | 2.4x faster | 58% less | | **Mistral (7B)** | ▶️ Start on Colab | 2.2x faster | 62% less | | **DPO - Zephyr** | ▶️ Start on Colab | 1.9x faster | 19% less | - This conversational notebook is useful for ShareGPT ChatML / Vicuna templates. - This text completion notebook is for raw text. This DPO notebook replicates Zephyr. - \\* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster. ## Special Thanks A huge thank you to the Meta and Llama team for creating and releasing these models. ## Model Information The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models in 8B, 70B and 405B sizes (text in/text out). The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks. **Model developer**: Meta **Model Architecture:** Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
| | Training Data | Params | Input modalities | Output modalities | Context length | GQA | Token count | Knowledge cutoff |
|---|---|---|---|---|---|---|---|---|
| Llama 3.1 (text only) | A new mix of publicly available online data. | 8B | Multilingual Text | Multilingual Text and code | 128k | Yes | 15T+ | December 2023 |
| | | 70B | Multilingual Text | Multilingual Text and code | 128k | Yes | | |
| | | 405B | Multilingual Text | Multilingual Text and code | 128k | Yes | | |
**Supported languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. **Llama 3.1 family of models**. Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date:** July 23, 2024. **Status:** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. **License:** A custom commercial license, the Llama 3.1 Community License, is available at: Where to send questions or comments about the model Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go here. ## Intended Use **Intended Use Cases** Llama 3.1 is intended for commercial and research use in multiple languages. Instruction tuned text only models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. The Llama 3.1 model collection also supports the ability to leverage the outputs of its models to improve other models including synthetic data generation and distillation. The Llama 3.1 Community License allows for these use cases. **Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.1 Community License. Use in languages beyond those explicitly referenced as supported in this model card**. **Note: Llama 3.1 has been trained on a broader collection of languages than the 8 supported languages. Developers may fine-tune Llama 3.1 models for languages beyond the 8 supported languages provided they comply with the Llama 3.1 Community License and the Acceptable Use Policy and in such cases are responsible for ensuring that any uses of Llama 3.1 in additional languages is done in a safe and responsible manner. ## How to use This repository contains two versions of Meta-Llama-3.1-8B-Instruct, for use with transformers and with the original codebase. ### Use with transformers Starting with onward, you can run conversational inference using the Transformers abstraction or by leveraging the Auto classes with the function. Make sure to update your transformers installation via . Note: You can also find detailed recipes on how to use the model locally, with , assisted generations, quantised and more at []( ### Use with Please, follow the instructions in the repository To download Original checkpoints, see the example command below leveraging : ## Hardware and Software **Training Factors** We used custom training libraries, Meta's custom built GPU cluster, and production infrastructure for pretraining. Fine-tuning, annotation, and evaluation were also performed on production infrastructure. **Training utilized a cumulative of** 39.3M GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency. **Training Greenhouse Gas Emissions** Estimated total location-based greenhouse gas emissions were **11,390** tons CO2eq for training. 
Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with renewable energy; therefore, the total market-based greenhouse gas emissions for training were 0 tons CO2eq.
| | Training Time (GPU hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) |
|---|---|---|---|---|
| Llama 3.1 8B | 1.46M | 700 | 420 | 0 |
| Llama 3.1 70B | 7.0M | 700 | 2,040 | 0 |
| Llama 3.1 405B | 30.84M | 700 | 8,930 | 0 |
| Total | 39.3M | | 11,390 | 0 |
The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others. ## Training Data **Overview:** Llama 3.1 was pretrained on ~15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 25M synthetically generated examples. **Data Freshness:** The pretraining data has a cutoff of December 2023. ## Benchmark scores In this section, we report the results for Llama 3.1 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library. ### Base pretrained models
| Category | Benchmark | # Shots | Metric | Llama 3 8B | Llama 3.1 8B | Llama 3 70B | Llama 3.1 70B | Llama 3.1 405B |
|---|---|---|---|---|---|---|---|---|
| General | MMLU | 5 | macro_avg/acc_char | 66.7 | 66.7 | 79.5 | 79.3 | 85.2 |
| | MMLU-Pro (CoT) | 5 | macro_avg/acc_char | 36.2 | 37.1 | 55.0 | 53.8 | 61.6 |
| | AGIEval English | 3-5 | average/acc_char | 47.1 | 47.8 | 63.0 | 64.6 | 71.6 |
| | CommonSenseQA | 7 | acc_char | 72.6 | 75.0 | 83.8 | 84.1 | 85.8 |
| | Winogrande | 5 | acc_char | - | 60.5 | - | 83.3 | 86.7 |
| | BIG-Bench Hard (CoT) | 3 | average/em | 61.1 | 64.2 | 81.3 | 81.6 | 85.9 |
| | ARC-Challenge | 25 | acc_char | 79.4 | 79.7 | 93.1 | 92.9 | 96.1 |
| Knowledge reasoning | TriviaQA-Wiki | 5 | em | 78.5 | 77.6 | 89.7 | 89.8 | 91.8 |
| Reading comprehension | SQuAD | 1 | em | 76.4 | 77.0 | 85.6 | 81.8 | 89.3 |
| | QuAC (F1) | 1 | f1 | 44.4 | 44.9 | 51.1 | 51.1 | 53.6 |
| | BoolQ | 0 | acc_char | 75.7 | 75.0 | 79.0 | 79.4 | 80.0 |
| | DROP (F1) | 3 | f1 | 58.4 | 59.5 | 79.7 | 79.6 | 84.8 |
### Instruction tuned models
| Category | Benchmark | # Shots | Metric | Llama 3 8B Instruct | Llama 3.1 8B Instruct | Llama 3 70B Instruct | Llama 3.1 70B Instruct | Llama 3.1 405B Instruct |
|---|---|---|---|---|---|---|---|---|
| General | MMLU | 5 | macro_avg/acc | 68.5 | 69.4 | 82.0 | 83.6 | 87.3 |
| | MMLU (CoT) | 0 | macro_avg/acc | 65.3 | 73.0 | 80.9 | 86.0 | 88.6 |
| | MMLU-Pro (CoT) | 5 | micro_avg/acc_char | 45.5 | 48.3 | 63.4 | 66.4 | 73.3 |
| | IFEval | | | 76.8 | 80.4 | 82.9 | 87.5 | 88.6 |
| Reasoning | ARC-C | 0 | acc | 82.4 | 83.4 | 94.4 | 94.8 | 96.9 |
| | GPQA | 0 | em | 34.6 | 30.4 | 39.5 | 41.7 | 50.7 |
| Code | HumanEval | 0 | pass@1 | 60.4 | 72.6 | 81.7 | 80.5 | 89.0 |
| | MBPP ++ base version | 0 | pass@1 | 70.6 | 72.8 | 82.5 | 86.0 | 88.6 |
| | Multipl-E HumanEval | 0 | pass@1 | - | 50.8 | - | 65.5 | 75.2 |
| | Multipl-E MBPP | 0 | pass@1 | - | 52.4 | - | 62.0 | 65.7 |
| Math | GSM-8K (CoT) | 8 | em_maj1@1 | 80.6 | 84.5 | 93.0 | 95.1 | 96.8 |
| | MATH (CoT) | 0 | final_em | 29.1 | 51.9 | 51.0 | 68.0 | 73.8 |
| Tool Use | API-Bank | 0 | acc | 48.3 | 82.6 | 85.1 | 90.0 | 92.0 |
| | BFCL | 0 | acc | 60.3 | 76.1 | 83.0 | 84.8 | 88.5 |
| | Gorilla Benchmark API Bench | 0 | acc | 1.7 | 8.2 | 14.7 | 29.7 | 35.3 |
| | Nexus (0-shot) | 0 | macro_avg/acc | 18.1 | 38.5 | 47.8 | 56.7 | 58.7 |
| Multilingual | Multilingual MGSM (CoT) | 0 | em | - | 68.9 | - | 86.9 | 91.6 |
#### Multilingual benchmarks
| Category | Benchmark | Language | Llama 3.1 8B | Llama 3.1 70B | Llama 3.1 405B |
|---|---|---|---|---|---|
| General | MMLU (5-shot, macro_avg/acc) | Portuguese | 62.12 | 80.13 | 84.95 |
| | | Spanish | 62.45 | 80.05 | 85.08 |
| | | Italian | 61.63 | 80.4 | 85.04 |
| | | German | 60.59 | 79.27 | 84.36 |
| | | French | 62.34 | 79.82 | 84.66 |
| | | Hindi | 50.88 | 74.52 | 80.31 |
| | | Thai | 50.32 | 72.95 | 78.21 |
## Responsibility & Safety As part of our Responsible release approach, we followed a three-pronged strategy to managing trust & safety risks: * Enable developers to deploy helpful, safe and flexible experiences for their target audience and for the use cases supported by Llama. * Protect developers against adversarial users aiming to exploit Llama capabilities to potentially cause harm. * Provide protections for the community to help prevent the misuse of our models. ### Responsible deployment Llama is a foundational technology designed to be used in a variety of use cases, examples on how Meta’s Llama models have been responsibly deployed can be found in our Community Stories webpage. Our approach is to build the most helpful models enabling the world to benefit from the technology power, by aligning our model safety for the generic use cases addressing a standard set of harms. Developers are then in the driver seat to tailor safety for their use case, defining their own policy and deploying the models with the necessary safeguards in their Llama systems. Llama 3.1 was developed following the best practices outlined in our Responsible Use Guide, you can refer to the Responsible Use Guide to learn more. #### Llama 3.1 instruct Our main objectives for conducting safety fine-tuning are to provide the research community with a valuable resource for studying the robustness of safety fine-tuning, as well as to offer developers a readily available, safe, and powerful model for various applications to reduce the developer workload to deploy safe AI systems. For more details on the safety mitigations implemented please read the Llama 3 paper. **Fine-tuning data** We employ a multi-faceted approach to data collection, combining human-generated data from our vendors with synthetic data to mitigate potential safety risks. We’ve developed many large language model (LLM)-based classifiers that enable us to thoughtfully select high-quality prompts and responses, enhancing data quality control. **Refusals and Tone** Building on the work we started with Llama 3, we put a great emphasis on model refusals to benign prompts as well as refusal tone. We included both borderline and adversarial prompts in our safety data strategy, and modified our safety data responses to follow tone guidelines. #### Llama 3.1 systems **Large language models, including Llama 3.1, are not designed to be deployed in isolation but instead should be deployed as part of an overall AI system with additional safety guardrails as required.** Developers are expected to deploy system safeguards when building agentic systems. Safeguards are key to achieve the right helpfulness-safety alignment as well as mitigating safety and security risks inherent to the system and any integration of the model or system with external tools. As part of our responsible release approach, we provide the community with safeguards that developers should deploy with Llama models or other LLMs, including Llama Guard 3, Prompt Guard and Code Shield. All our reference implementations demos contain these safeguards by default so developers can benefit from system-level safety out-of-the-box. #### New capabilities Note that this release introduces new capabilities, including a longer context window, multilingual inputs and outputs and possible integrations by developers with third party tools. Building with these new capabilities requires specific considerations in addition to the best practices that generally apply across all Generative AI use cases. 
**Tool-use**: Just like in standard software development, developers are responsible for the integration of the LLM with the tools and services of their choice. They should define a clear policy for their use case and assess the integrity of the third party services they use to be aware of the safety and security limitations when using this capability. Refer to the Responsible Use Guide for best practices on the safe deployment of the third party safeguards. **Multilinguality**: Llama 3.1 supports 7 languages in addition to English: French, German, Hindi, Italian, Portuguese, Spanish, and Thai. Llama may be able to output text in other languages than those that meet performance thresholds for safety and helpfulness. We strongly discourage developers from using this model to converse in non-supported languages without implementing finetuning and system controls in alignment with their policies and the best practices shared in the Responsible Use Guide. ### Evaluations We evaluated Llama models for common use cases as well as specific capabilities. Common use cases evaluations measure safety risks of systems for most commonly built applications including chat bot, coding assistant, tool calls. We built dedicated, adversarial evaluation datasets and evaluated systems composed of Llama models and Llama Guard 3 to filter input prompt and output response. It is important to evaluate applications in context, and we recommend building dedicated evaluation dataset for your use case. Prompt Guard and Code Shield are also available if relevant to the application. Capability evaluations measure vulnerabilities of Llama models inherent to specific capabilities, for which were crafted dedicated benchmarks including long context, multilingual, tools calls, coding or memorization. **Red teaming** For both scenarios, we conducted recurring red teaming exercises with the goal of discovering risks via adversarial prompting and we used the learnings to improve our benchmarks and safety tuning datasets. We partnered early with subject-matter experts in critical risk areas to understand the nature of these real-world harms and how such models may lead to unintended harm for society. Based on these conversations, we derived a set of adversarial goals for the red team to attempt to achieve, such as extracting harmful information or reprogramming the model to act in a potentially harmful capacity. The red team consisted of experts in cybersecurity, adversarial machine learning, responsible AI, and integrity in addition to multilingual content specialists with background in integrity issues in specific geographic markets. ### Critical and other risks We specifically focused our efforts on mitigating the following critical risk areas: **1- CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosive materials) helpfulness** To assess risks related to proliferation of chemical and biological weapons, we performed uplift testing designed to assess whether use of Llama 3.1 models could meaningfully increase the capabilities of malicious actors to plan or carry out attacks using these types of weapons. **2. Child Safety** Child Safety risk assessments were conducted using a team of experts, to assess the model’s capability to produce outputs that could result in Child Safety risks and inform on any necessary and appropriate risk mitigations via fine tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks through Llama 3 model development. 
For Llama 3, we conducted new in-depth sessions using objective based methodologies to assess the model risks along multiple attack vectors including the additional languages Llama 3 is trained on. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content while taking account of market specific nuances or experiences. **3. Cyber attack enablement** Our cyber attack uplift study investigated whether LLMs can enhance human capabilities in hacking tasks, both in terms of skill level and speed. Our attack automation study focused on evaluating the capabilities of LLMs when used as autonomous agents in cyber offensive operations, specifically in the context of ransomware attacks. This evaluation was distinct from previous studies that considered LLMs as interactive assistants. The primary objective was to assess whether these models could effectively function as independent agents in executing complex cyber-attacks without human intervention. Our study of Llama-3.1-405B’s social engineering uplift for cyber attackers was conducted to assess the effectiveness of AI models in aiding cyber threat actors in spear phishing campaigns. Please read our Llama 3.1 Cyber security whitepaper to learn more. ### Community Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open sourced for the community to use and widely distributed across ecosystem partners including cloud service providers. We encourage community contributions to our Github repository. We also set up the Llama Impact Grants program to identify and support the most compelling applications of Meta’s Llama model for societal benefit across three categories: education, climate and open innovation. The 20 finalists from the hundreds of applications can be found here. Finally, we put in place a set of resources including an output reporting mechanism and bug bounty program to continuously improve the Llama technology with the help of the community. ## Ethical Considerations and Limitations The core values of Llama 3.1 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3.1 addresses users and their needs as they are, without insertion unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. But Llama 3.1 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3.1’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. 
Therefore, before deploying any applications of Llama 3.1 models, developers should perform safety testing and tuning tailored to their specific applications of the model. Please refer to available resources including our Responsible Use Guide, Trust and Safety solutions, and other resources to learn more about responsible development.",
+ "model_explanation_gemini": "A 4-bit quantized version of Meta's Llama-3.1-8B model optimized for faster fine-tuning and reduced memory usage via Unsloth, primarily designed for multilingual text/code generation and dialogue tasks.\n\n**Features:** \n- Base model: Meta-Llama-3.1-8B \n- 4-bit quantization (bnb-4bit) \n- Optimized for 2.4x faster fine-tuning with 58% less memory \n- Supports multilingual"
+}
\ No newline at end of file
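The Llama 3.1 card above mentions running generation through the Transformers pipeline abstraction. Since this repo is a base (not instruction-tuned) checkpoint, a plain text-completion sketch is the natural fit; the prompt and sampling values below are illustrative assumptions:

```python
from transformers import pipeline

# Base models are used for raw text completion rather than chat.
generator = pipeline(
    "text-generation",
    model="unsloth/Meta-Llama-3.1-8B-bnb-4bit",
    device_map="auto",
)

prompt = "Grouped-Query Attention improves inference scalability because"
out = generator(prompt, max_new_tokens=64, do_sample=True, temperature=0.7)
print(out[0]["generated_text"])
```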
diff --git a/model_data_json/unsloth_Mistral-Small-24B-Instruct-2501-unsloth-bnb-4bit.json b/model_data_json/unsloth_Mistral-Small-24B-Instruct-2501-unsloth-bnb-4bit.json
new file mode 100644
index 0000000000000000000000000000000000000000..c8be66a252cd32d00e35b379da6e929dcf718ed2
--- /dev/null
+++ b/model_data_json/unsloth_Mistral-Small-24B-Instruct-2501-unsloth-bnb-4bit.json
@@ -0,0 +1,26 @@
+{
+ "model_id": "unsloth/Mistral-Small-24B-Instruct-2501-unsloth-bnb-4bit",
+ "downloads": 77836,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "mistral",
+ "text-generation",
+ "unsloth",
+ "mistral-instruct",
+ "instruct",
+ "conversational",
+ "en",
+ "base_model:mistralai/Mistral-Small-24B-Instruct-2501",
+ "base_model:quantized:mistralai/Mistral-Small-24B-Instruct-2501",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "4-bit",
+ "bitsandbytes",
+ "region:us"
+ ],
+ "description": "--- language: - en library_name: transformers license: apache-2.0 tags: - unsloth - transformers - mistral - mistral-instruct - instruct base_model: mistralai/Mistral-Small-24B-Instruct-2501 --- # Finetune LLMs 2-5x faster with 70% less memory via Unsloth! We have a free Google Colab Tesla T4 notebook for Mistral (7B) here: ## ✨ Finetune for Free All notebooks are **beginner friendly**! Add your dataset, click \"Run All\", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **Llama-3.2 (3B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.4x faster | 58% less | | **Llama-3.2 (11B vision)** | ▶️ Start on Colab-Vision.ipynb) | 2x faster | 60% less | | **Qwen2 VL (7B)** | ▶️ Start on Colab-Vision.ipynb) | 1.8x faster | 60% less | | **Qwen2.5 (7B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2x faster | 60% less | | **Llama-3.1 (8B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Phi-3.5 (mini)** | ▶️ Start on Colab | 2x faster | 50% less | | **Gemma 2 (9B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Mistral (7B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.2x faster | 62% less | - This Llama 3.2 conversational notebook-Conversational.ipynb) is useful for ShareGPT ChatML / Vicuna templates. - This text completion notebook-Text_Completion.ipynb) is for raw text. This DPO notebook replicates Zephyr. - \\* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster. # Model Card for Mistral-Small-24B-Instruct-2501 Mistral Small 3 ( 2501 ) sets a new benchmark in the \"small\" Large Language Models category below 70B, boasting 24B parameters and achieving state-of-the-art capabilities comparable to larger models! This model is an instruction-fine-tuned version of the base model: Mistral-Small-24B-Base-2501. Mistral Small can be deployed locally and is exceptionally \"knowledge-dense\", fitting in a single RTX 4090 or a 32GB RAM MacBook once quantized. Perfect for: - Fast response conversational agents. - Low latency function calling. - Subject matter experts via fine-tuning. - Local inference for hobbyists and organizations handling sensitive data. For enterprises that need specialized capabilities (increased context, particular modalities, domain specific knowledge, etc.), we will be releasing commercial models beyond what Mistral AI contributes to the community. This release demonstrates our commitment to open source, serving as a strong base model. Learn more about Mistral Small in our blog post. Model developper: Mistral AI Team ## Key Features - **Multilingual:** Supports dozens of languages, including English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch, and Polish. - **Agent-Centric:** Offers best-in-class agentic capabilities with native function calling and JSON outputting. - **Advanced Reasoning:** State-of-the-art conversational and reasoning capabilities. - **Apache 2.0 License:** Open license allowing usage and modification for both commercial and non-commercial purposes. - **Context Window:** A 32k context window. - **System Prompt:** Maintains strong adherence and support for system prompts. - **Tokenizer:** Utilizes a Tekken tokenizer with a 131k vocabulary size. 
## Benchmark results ### Human evaluated benchmarks | Category | Gemma-2-27B | Qwen-2.5-32B | Llama-3.3-70B | Gpt4o-mini | |----------|-------------|--------------|---------------|------------| | Mistral is better | 0.536 | 0.496 | 0.192 | 0.200 | | Mistral is slightly better | 0.196 | 0.184 | 0.164 | 0.204 | | Ties | 0.052 | 0.060 | 0.236 | 0.160 | | Other is slightly better | 0.060 | 0.088 | 0.112 | 0.124 | | Other is better | 0.156 | 0.172 | 0.296 | 0.312 | **Note**: - We conducted side by side evaluations with an external third-party vendor, on a set of over 1k proprietary coding and generalist prompts. - Evaluators were tasked with selecting their preferred model response from anonymized generations produced by Mistral Small 3 vs another model. - We are aware that in some cases the benchmarks on human judgement starkly differ from publicly available benchmarks, but have taken extra caution in verifying a fair evaluation. We are confident that the above benchmarks are valid. ### Publicly accesible benchmarks **Reasoning & Knowledge** | Evaluation | mistral-small-24B-instruct-2501 | gemma-2b-27b | llama-3.3-70b | qwen2.5-32b | gpt-4o-mini-2024-07-18 | |------------|---------------|--------------|---------------|---------------|-------------| | mmlu_pro_5shot_cot_instruct | 0.663 | 0.536 | 0.666 | 0.683 | 0.617 | | gpqa_main_cot_5shot_instruct | 0.453 | 0.344 | 0.531 | 0.404 | 0.377 | **Math & Coding** | Evaluation | mistral-small-24B-instruct-2501 | gemma-2b-27b | llama-3.3-70b | qwen2.5-32b | gpt-4o-mini-2024-07-18 | |------------|---------------|--------------|---------------|---------------|-------------| | humaneval_instruct_pass@1 | 0.848 | 0.732 | 0.854 | 0.909 | 0.890 | | math_instruct | 0.706 | 0.535 | 0.743 | 0.819 | 0.761 | **Instruction following** | Evaluation | mistral-small-24B-instruct-2501 | gemma-2b-27b | llama-3.3-70b | qwen2.5-32b | gpt-4o-mini-2024-07-18 | |------------|---------------|--------------|---------------|---------------|-------------| | mtbench_dev | 8.35 | 7.86 | 7.96 | 8.26 | 8.33 | | wildbench | 52.27 | 48.21 | 50.04 | 52.73 | 56.13 | | arena_hard | 0.873 | 0.788 | 0.840 | 0.860 | 0.897 | | ifeval | 0.829 | 0.8065 | 0.8835 | 0.8401 | 0.8499 | **Note**: - Performance accuracy on all benchmarks were obtained through the same internal evaluation pipeline - as such, numbers may vary slightly from previously reported performance (Qwen2.5-32B-Instruct, Llama-3.3-70B-Instruct, Gemma-2-27B-IT). - Judge based evals such as Wildbench, Arena hard and MTBench were based on gpt-4o-2024-05-13. ### Basic Instruct Template (V7-Tekken) *, and are placeholders.* ***Please make sure to use mistral-common as the source of truth*** ## Usage The model can be used with the following frameworks; - []( See here - []( See here ### vLLM We recommend using this model with the vLLM library to implement production-ready inference pipelines. **Note 1**: We recommond using a relatively low temperature, such as . **Note 2**: Make sure to add a system prompt to the model to best tailer it for your needs. If you want to use the model as a general assistant, we recommend the following system prompt: **_Installation_** Make sure you install []( Also make sure you have []( installed: You can also make use of a ready-to-go docker image or on the docker hub. #### Server We recommand that you use Mistral-Small-24B-Instruct-2501 in a server/client setting. 1. Spin up a server: **Note:** Running Mistral-Small-24B-Instruct-2501 on GPU requires ~55 GB of GPU RAM in bf16 or fp16. 2. 
To query the server you can use a simple Python snippet. ### Function calling Mistral-Small-24B-Instruct-2501 is excellent at function / tool calling tasks via vLLM. *E.g.:* Example #### Offline ### Transformers If you want to use Hugging Face transformers to generate text, you can do something like this.",
+ "model_explanation_gemini": "A 24B-parameter instruction-tuned multilingual model optimized for fast conversational agents, low-latency function calling, and local inference with advanced reasoning and 32k context window.\n\n**Features:** \n- Multilingual support (English, French, Chinese, etc.) \n- Agent-centric (function calling, JSON output) \n- 32k context window \n- Apache 2.0 licensed \n- Optimized for local deployment (fits RTX 4090/MacBook 32"
+}
\ No newline at end of file
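The Mistral-Small-24B card above recommends serving the model with vLLM behind an OpenAI-compatible endpoint, using a system prompt and a relatively low temperature; the exact values are not preserved in this dump, so the endpoint, system prompt, and temperature below are assumptions. A minimal client sketch using the openai package:

```python
from openai import OpenAI

# Point the OpenAI client at a locally running vLLM server
# (default vLLM OpenAI-compatible endpoint; adjust host/port to your setup).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="mistralai/Mistral-Small-24B-Instruct-2501",
    messages=[
        {"role": "system", "content": "You are a helpful, concise assistant."},  # illustrative system prompt
        {"role": "user", "content": "Give me three tips for writing clear changelogs."},
    ],
    temperature=0.15,  # "relatively low" per the card; exact value assumed
    max_tokens=256,
)
print(response.choices[0].message.content)
```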
diff --git a/model_data_json/unsloth_Qwen2.5-3B-Instruct-unsloth-bnb-4bit.json b/model_data_json/unsloth_Qwen2.5-3B-Instruct-unsloth-bnb-4bit.json
new file mode 100644
index 0000000000000000000000000000000000000000..841be10267d5548908d74895cd56e57d15f6466f
--- /dev/null
+++ b/model_data_json/unsloth_Qwen2.5-3B-Instruct-unsloth-bnb-4bit.json
@@ -0,0 +1,26 @@
+{
+ "model_id": "unsloth/Qwen2.5-3B-Instruct-unsloth-bnb-4bit",
+ "downloads": 75387,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "qwen2",
+ "text-generation",
+ "unsloth",
+ "qwen",
+ "conversational",
+ "en",
+ "arxiv:2407.10671",
+ "base_model:Qwen/Qwen2.5-3B-Instruct",
+ "base_model:quantized:Qwen/Qwen2.5-3B-Instruct",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "4-bit",
+ "bitsandbytes",
+ "region:us"
+ ],
+ "description": "--- base_model: Qwen/Qwen2.5-3B-Instruct language: - en library_name: transformers license: apache-2.0 tags: - unsloth - transformers - qwen ---
Finetune LLMs 2-5x faster with 70% less memory via Unsloth!
We have a free Google Colab Tesla T4 notebook for Qwen2.5 (7B) here: ## ✨ Finetune for Free All notebooks are **beginner friendly**! Add your dataset, click \"Run All\", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **Llama-3.2 (3B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.4x faster | 58% less | | **Llama-3.2 (11B vision)** | ▶️ Start on Colab-Vision.ipynb) | 2x faster | 60% less | | **Qwen2 VL (7B)** | ▶️ Start on Colab-Vision.ipynb) | 1.8x faster | 60% less | | **Qwen2.5 (7B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2x faster | 60% less | | **Llama-3.1 (8B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Phi-3.5 (mini)** | ▶️ Start on Colab | 2x faster | 50% less | | **Gemma 2 (9B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Mistral (7B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.2x faster | 62% less | - This Llama 3.2 conversational notebook-Conversational.ipynb) is useful for ShareGPT ChatML / Vicuna templates. - This text completion notebook-Text_Completion.ipynb) is for raw text. This DPO notebook replicates Zephyr. - \\* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster. # Qwen2.5 ## Introduction Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2: - Significantly **more knowledge** and has greatly improved capabilities in **coding** and **mathematics**, thanks to our specialized expert models in these domains. - Significant improvements in **instruction following**, **generating long texts** (over 8K tokens), **understanding structured data** (e.g, tables), and **generating structured outputs** especially JSON. **More resilient to the diversity of system prompts**, enhancing role-play implementation and condition-setting for chatbots. - **Long-context Support** up to 128K tokens and can generate up to 8K tokens. - **Multilingual support** for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more. **This repo contains the base 0.5B Qwen2.5 model**, which has the following features: - Type: Causal Language Models - Training Stage: Pretraining - Architecture: transformers with RoPE, SwiGLU, RMSNorm, Attention QKV bias and tied word embeddings - Number of Parameters: 0.49B - Number of Paramaters (Non-Embedding): 0.36B - Number of Layers: 24 - Number of Attention Heads (GQA): 14 for Q and 2 for KV - Context Length: Full 32,768 tokens **We do not recommend using base language models for conversations.** Instead, you can apply post-training, e.g., SFT, RLHF, continued pretraining, etc., on this model. For more details, please refer to our blog, GitHub, and Documentation. ## Requirements The code of Qwen2.5 has been in the latest Hugging face and we advise you to use the latest version of . With , you will encounter the following error: ## Evaluation & Performance Detailed evaluation results are reported in this 📑 blog. For requirements on GPU memory and the respective throughput, see results here. 
## Citation If you find our work helpful, feel free to give us a cite.",
+ "model_explanation_gemini": "A 4-bit quantized version of the Qwen2.5-3B-Instruct model optimized by Unsloth for faster fine-tuning and reduced memory usage while improving accuracy over standard 4-bit quantization. \n\n**Features**: \n- Based on Qwen2.5-3B-Instruct, a 3B-parameter instruction-tuned LLM. \n- Uses Unsloth’s selective 4-bit quantization for higher accuracy. \n- Enables 2-5x"
+}
\ No newline at end of file
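The Qwen2.5 card above centres on Unsloth's fine-tuning notebooks. As a rough sketch of the LoRA setup those notebooks rely on (the rank, alpha, target modules, and sequence length are illustrative assumptions, not values from the card):

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-3B-Instruct-unsloth-bnb-4bit",
    max_seq_length=2048,   # illustrative
    load_in_4bit=True,
)

# Attach LoRA adapters; only these low-rank matrices are trained,
# which is what keeps memory use low on a single T4.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,            # LoRA rank (assumed)
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
# From here a trainer such as TRL's SFTTrainer can be run on your dataset,
# and the result exported to GGUF / vLLM / Hugging Face as the card describes.
```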
diff --git a/model_data_json/unsloth_Qwen3-32B-GGUF.json b/model_data_json/unsloth_Qwen3-32B-GGUF.json
new file mode 100644
index 0000000000000000000000000000000000000000..a21fd927d3d8c3ab3079a5d0bb09175752bc86f4
--- /dev/null
+++ b/model_data_json/unsloth_Qwen3-32B-GGUF.json
@@ -0,0 +1,24 @@
+{
+ "model_id": "unsloth/Qwen3-32B-GGUF",
+ "downloads": 71036,
+ "tags": [
+ "transformers",
+ "gguf",
+ "qwen3",
+ "text-generation",
+ "qwen",
+ "unsloth",
+ "en",
+ "arxiv:2309.00071",
+ "base_model:Qwen/Qwen3-32B",
+ "base_model:quantized:Qwen/Qwen3-32B",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us",
+ "imatrix",
+ "conversational"
+ ],
+ "description": "--- base_model: Qwen/Qwen3-32B language: - en library_name: transformers license_link: license: apache-2.0 tags: - qwen3 - qwen - unsloth - transformers ---
- Fine-tune Qwen3 (14B) for free using our Google Colab notebook here! - Read our Blog about Qwen3 support: unsloth.ai/blog/qwen3 - View the rest of our notebooks in our docs here. - Run & export your fine-tuned model to Ollama, llama.cpp or HF. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **Qwen3 (14B)** | ▶️ Start on Colab | 3x faster | 70% less | | **GRPO with Qwen3 (8B)** | ▶️ Start on Colab | 3x faster | 80% less | | **Llama-3.2 (3B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.4x faster | 58% less | | **Llama-3.2 (11B vision)** | ▶️ Start on Colab-Vision.ipynb) | 2x faster | 60% less | | **Qwen2.5 (7B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2x faster | 60% less | | **Phi-4 (14B)** | ▶️ Start on Colab | 2x faster | 50% less | # To Switch Between Thinking and Non-Thinking If you are using llama.cpp, Ollama, Open WebUI etc., you can add and to user prompts or system messages to switch the model's thinking mode from turn to turn. The model will follow the most recent instruction in multi-turn conversations. Here is an example of multi-turn conversation: # Qwen3-32B ## Qwen3 Highlights Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features: - **Uniquely support of seamless switching between thinking mode** (for complex logical reasoning, math, and coding) and **non-thinking mode** (for efficient, general-purpose dialogue) **within single model**, ensuring optimal performance across various scenarios. - **Significantly enhancement in its reasoning capabilities**, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning. - **Superior human preference alignment**, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience. - **Expertise in agent capabilities**, enabling precise integration with external tools in both thinking and unthinking modes and achieving leading performance among open-source models in complex agent-based tasks. - **Support of 100+ languages and dialects** with strong capabilities for **multilingual instruction following** and **translation**. ## Model Overview **Qwen3-32B** has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: 32.8B - Number of Paramaters (Non-Embedding): 31.2B - Number of Layers: 64 - Number of Attention Heads (GQA): 64 for Q and 8 for KV - Context Length: 32,768 natively and 131,072 tokens with YaRN. For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our blog, GitHub, and Documentation. ## Quickstart The code of Qwen3 has been in the latest Hugging Face and we advise you to use the latest version of . With , you will encounter the following error: The following contains a code snippet illustrating how to use the model generate content based on given inputs. 
For deployment, you can use or to create an OpenAI-compatible API endpoint: - vLLM: - SGLang: ## Switching Between Thinking and Non-Thinking Mode > [!TIP] > The switch is also available in APIs created by vLLM and SGLang. > Please refer to our documentation for more details. ### By default, Qwen3 has thinking capabilities enabled, similar to QwQ-32B. This means the model will use its reasoning abilities to enhance the quality of generated responses. For example, when explicitly setting or leaving it as the default value in , the model will engage its thinking mode. In this mode, the model will generate think content wrapped in a block, followed by the final response. > [!NOTE] > For thinking mode, use , , , and (the default setting in ). **DO NOT use greedy decoding**, as it can lead to performance degradation and endless repetitions. For more detailed guidance, please refer to the Best Practices section. ### We provide a hard switch to strictly disable the model's thinking behavior, aligning its functionality with the previous Qwen2.5-Instruct models. This mode is particularly useful in scenarios where disabling thinking is essential for enhancing efficiency. In this mode, the model will not generate any think content and will not include a block. > [!NOTE] > For non-thinking mode, we suggest using , , , and . For more detailed guidance, please refer to the Best Practices section. ### Advanced Usage: Switching Between Thinking and Non-Thinking Modes via User Input We provide a soft switch mechanism that allows users to dynamically control the model's behavior when . Specifically, you can add and to user prompts or system messages to switch the model's thinking mode from turn to turn. The model will follow the most recent instruction in multi-turn conversations. Here is an example of a multi-turn conversation: > **Note** > For API compatibility, when , regardless of whether the user uses or , the model will always output a block wrapped in . However, the content inside this block may be empty if thinking is disabled. > When , the soft switches are not valid. Regardless of any or tags input by the user, the model will not generate think content and will not include a block. ## Agentic Use Qwen3 excels in tool calling capabilities. We recommend using Qwen-Agent to make the best use of agentic ability of Qwen3. Qwen-Agent encapsulates tool-calling templates and tool-calling parsers internally, greatly reducing coding complexity. To define the available tools, you can use the MCP configuration file, use the integrated tool of Qwen-Agent, or integrate other tools by yourself. ## Processing Long Texts Qwen3 natively supports context lengths of up to 32,768 tokens. For conversations where the total length (including both input and output) significantly exceeds this limit, we recommend using RoPE scaling techniques to handle long texts effectively. We have validated the model's performance on context lengths of up to 131,072 tokens using the YaRN method. YaRN is currently supported by several inference frameworks, e.g., and for local use, and for deployment. In general, there are two approaches to enabling YaRN for supported frameworks: - Modifying the model files: In the file, add the fields: For , you need to regenerate the GGUF file after the modification. - Passing command line arguments: For , you can use For , you can use For from , you can use > [!IMPORTANT] > If you encounter the following warning > > please upgrade . 
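As a sketch of the file-modification approach above, the snippet below patches a local checkpoint's `config.json` with a static YaRN `rope_scaling` entry; the exact field names and values are assumptions carried over from other Qwen model cards, so verify them against the official documentation for your framework before relying on them.

```python
# Hypothetical helper: add a static YaRN rope_scaling section to a local config.json.
import json

config_path = "path/to/Qwen3-32B/config.json"  # assumed local checkpoint location

with open(config_path) as f:
    config = json.load(f)

config["rope_scaling"] = {
    "rope_type": "yarn",                       # assumed field names (see official docs)
    "factor": 4.0,                             # 32,768 * 4.0 = 131,072-token window
    "original_max_position_embeddings": 32768,
}

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
```

As the note below explains, a smaller factor (for example 2.0 when typical contexts stay around 65K tokens) may be preferable when the full 131K window is not needed.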
> [!NOTE] > All the notable open-source frameworks implement static YaRN, which means the scaling factor remains constant regardless of input length, **potentially impacting performance on shorter texts.** > We advise adding the configuration only when processing long contexts is required. > It is also recommended to modify the as needed. For example, if the typical context length for your application is 65,536 tokens, it would be better to set as 2.0. > [!NOTE] > The default in is set to 40,960. This allocation includes reserving 32,768 tokens for outputs and 8,192 tokens for typical prompts, which is sufficient for most scenarios involving short text processing. If the average context length does not exceed 32,768 tokens, we do not recommend enabling YaRN in this scenario, as it may potentially degrade model performance. > [!TIP] > The endpoint provided by Alibaba Model Studio supports dynamic YaRN by default and no extra configuration is needed. ## Best Practices To achieve optimal performance, we recommend the following settings: 1. **Sampling Parameters**: - For thinking mode (), use , , , and . **DO NOT use greedy decoding**, as it can lead to performance degradation and endless repetitions. - For non-thinking mode (), we suggest using , , , and . - For supported frameworks, you can adjust the parameter between 0 and 2 to reduce endless repetitions. However, using a higher value may occasionally result in language mixing and a slight decrease in model performance. 2. **Adequate Output Length**: We recommend using an output length of 32,768 tokens for most queries. For benchmarking on highly complex problems, such as those found in math and programming competitions, we suggest setting the max output length to 38,912 tokens. This provides the model with sufficient space to generate detailed and comprehensive responses, thereby enhancing its overall performance. 3. **Standardize Output Format**: We recommend using prompts to standardize model outputs when benchmarking. - **Math Problems**: Include \"Please reason step by step, and put your final answer within \\boxed{}.\" in the prompt. - **Multiple-Choice Questions**: Add the following JSON structure to the prompt to standardize responses: \"Please show your choice in the field with only the choice letter, e.g., .\" 4. **No Thinking Content in History**: In multi-turn conversations, the historical model output should only include the final output part and does not need to include the thinking content. It is implemented in the provided chat template in Jinja2. However, for frameworks that do not directly use the Jinja2 chat template, it is up to the developers to ensure that the best practice is followed. ### Citation If you find our work helpful, feel free to give us a cite.",
+ "model_explanation_gemini": "A 32.8B-parameter causal language model optimized for reasoning, multilingual tasks, and agent capabilities, featuring seamless switching between thinking (complex reasoning) and non-thinking (general dialogue) modes. \n\n**Features:** \n- Supports 100+ languages \n- 32K native context length (extendable to 131K with YaRN) \n- Enhanced reasoning, math, and coding in \"thinking mode\" \n- Efficient general dialogue in \"non-thinking mode\" \n- Superior alignment"
+}
\ No newline at end of file
diff --git a/model_data_json/unsloth_llama-2-7b-bnb-4bit.json b/model_data_json/unsloth_llama-2-7b-bnb-4bit.json
new file mode 100644
index 0000000000000000000000000000000000000000..02c359a57d96280a53c6b4fa3fe11619423e8aee
--- /dev/null
+++ b/model_data_json/unsloth_llama-2-7b-bnb-4bit.json
@@ -0,0 +1,23 @@
+{
+ "model_id": "unsloth/llama-2-7b-bnb-4bit",
+ "downloads": 70876,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "llama",
+ "text-generation",
+ "unsloth",
+ "llama2",
+ "llama-2",
+ "en",
+ "license:apache-2.0",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "4-bit",
+ "bitsandbytes",
+ "region:us"
+ ],
+ "description": "--- language: - en library_name: transformers license: apache-2.0 tags: - unsloth - transformers - llama - llama2 - llama-2 --- # Finetune Mistral, Gemma, Llama 2-5x faster with 70% less memory via Unsloth! Directly quantized 4bit model with . We have a Google Colab Tesla T4 notebook for Llama 7b here: ## ✨ Finetune for Free All notebooks are **beginner friendly**! Add your dataset, click \"Run All\", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **Gemma 7b** | ▶️ Start on Colab | 2.4x faster | 58% less | | **Mistral 7b** | ▶️ Start on Colab | 2.2x faster | 62% less | | **Llama-2 7b** | ▶️ Start on Colab | 2.2x faster | 43% less | | **TinyLlama** | ▶️ Start on Colab | 3.9x faster | 74% less | | **CodeLlama 34b** A100 | ▶️ Start on Colab | 1.9x faster | 27% less | | **Mistral 7b** 1xT4 | ▶️ Start on Kaggle | 5x faster\\* | 62% less | | **DPO - Zephyr** | ▶️ Start on Colab | 1.9x faster | 19% less | - This conversational notebook is useful for ShareGPT ChatML / Vicuna templates. - This text completion notebook is for raw text. This DPO notebook replicates Zephyr. - \\* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster.",
+ "model_explanation_gemini": "A 4-bit quantized Llama-2-7B model optimized for faster fine-tuning (2.2x speed) and reduced memory usage (43% less) using Unsloth, with beginner-friendly Colab notebooks for deployment. \n\n**Features:** \n- 4-bit quantization for efficiency \n- 2.2x faster fine-tuning than standard Llama-2-7B \n- 43% lower memory consumption \n- Supports export to GGUF, vLLM,"
+}
\ No newline at end of file
diff --git a/model_data_json/unsloth_llama-3-8b-bnb-4bit.json b/model_data_json/unsloth_llama-3-8b-bnb-4bit.json
new file mode 100644
index 0000000000000000000000000000000000000000..cc2fa42282c77e42a7441ec4be2644521e6d725b
--- /dev/null
+++ b/model_data_json/unsloth_llama-3-8b-bnb-4bit.json
@@ -0,0 +1,26 @@
+{
+ "model_id": "unsloth/llama-3-8b-bnb-4bit",
+ "downloads": 76826,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "llama",
+ "text-generation",
+ "llama-3",
+ "meta",
+ "facebook",
+ "unsloth",
+ "en",
+ "base_model:meta-llama/Meta-Llama-3-8B",
+ "base_model:quantized:meta-llama/Meta-Llama-3-8B",
+ "license:llama3",
+ "autotrain_compatible",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "4-bit",
+ "bitsandbytes",
+ "region:us"
+ ],
+ "description": "--- language: - en library_name: transformers license: llama3 tags: - llama-3 - llama - meta - facebook - unsloth - transformers base_model: - meta-llama/Meta-Llama-3-8B --- # Finetune Llama 3.2, Gemma 2, Mistral 2-5x faster with 70% less memory via Unsloth! We have a free Google Colab Tesla T4 notebook for Llama 3.1 (8B) here: # Finetune Llama 3.3, Gemma 2, Mistral 2-5x faster with 70% less memory via Unsloth! We have a free Google Colab Tesla T4 notebook for Llama 3.1 (8B) here: # unsloth/Llama-3-8B-bnb-4bit For more details on the model, please go to Meta's original model card ## ✨ Finetune for Free All notebooks are **beginner friendly**! Add your dataset, click \"Run All\", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **Llama-3.2 (3B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.4x faster | 58% less | | **Llama-3.2 (11B vision)** | ▶️ Start on Colab-Vision.ipynb) | 2x faster | 60% less | | **Qwen2 VL (7B)** | ▶️ Start on Colab-Vision.ipynb) | 1.8x faster | 60% less | | **Qwen2.5 (7B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2x faster | 60% less | | **Llama-3.1 (8B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Phi-3.5 (mini)** | ▶️ Start on Colab | 2x faster | 50% less | | **Gemma 2 (9B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Mistral (7B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.2x faster | 62% less | - This Llama 3.2 conversational notebook-Conversational.ipynb) is useful for ShareGPT ChatML / Vicuna templates. - This text completion notebook-Text_Completion.ipynb) is for raw text. This DPO notebook replicates Zephyr. - \\* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster. ## Special Thanks A huge thank you to the Meta and Llama team for creating and releasing these models. ## Model Details Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned generative text models in 8 and 70B sizes. The Llama 3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety. **Model developers** Meta **Variations** Llama 3 comes in two sizes — 8B and 70B parameters — in pre-trained and instruction tuned variants. **Input** Models input text only. **Output** Models generate text and code only. **Model Architecture** Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
| | Training Data | Params | Context length | GQA | Token count | Knowledge cutoff |
|---|---|---|---|---|---|---|
| Llama 3 | A new mix of publicly available online data. | 8B | 8k | Yes | 15T+ | March, 2023 |
| Llama 3 | A new mix of publicly available online data. | 70B | 8k | Yes | 15T+ | December, 2023 |
**Llama 3 family of models**. Token counts refer to pretraining data only. Both the 8B and 70B versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date** April 18, 2024. **Status** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. **License** A custom commercial license is available at: **Where to send questions or comments about the model** Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3 in applications, please go here. ## Intended Use **Intended Use Cases** Llama 3 is intended for commercial and research use in English. Instruction tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. **Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3 Community License. Use in languages other than English**. **Note: Developers may fine-tune Llama 3 models for languages beyond English provided they comply with the Llama 3 Community License and the Acceptable Use Policy. ## How to use This repository contains two versions of Meta-Llama-3-8B, for use with transformers and with the original codebase. ### Use with transformers See the snippet below for usage with Transformers: ### Use with Please follow the instructions in the repository. To download the original checkpoints, see the example command below leveraging : For Hugging Face support, we recommend using transformers or TGI, but a similar command works. ## Hardware and Software **Training Factors** We used custom training libraries, Meta's Research SuperCluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute. **Carbon Footprint** Pretraining utilized a cumulative 7.7M GPU hours of computation on hardware of type H100-80GB (TDP of 700W). Estimated total emissions were 2290 tCO2eq, 100% of which were offset by Meta’s sustainability program.
| | Time (GPU hours) | Power Consumption (W) | Carbon Emitted (tCO2eq) |
|---|---|---|---|
| Llama 3 8B | 1.3M | 700 | 390 |
| Llama 3 70B | 6.4M | 700 | 1900 |
| Total | 7.7M | | 2290 |
**CO2 emissions during pre-training**. Time: total GPU time required for training each model. Power Consumption: peak power capacity per GPU device for the GPUs used, adjusted for power usage efficiency. 100% of the emissions are directly offset by Meta's sustainability program, and because we are openly releasing these models, the pretraining costs do not need to be incurred by others. ## Training Data **Overview** Llama 3 was pretrained on over 15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 10M human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data. **Data Freshness** The pretraining data has a cutoff of March 2023 for the 8B and December 2023 for the 70B models, respectively. ## Benchmarks In this section, we report the results for Llama 3 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library. For details on the methodology see here. ### Base pretrained models
| Category | Benchmark | Llama 3 8B | Llama2 7B | Llama2 13B | Llama 3 70B | Llama2 70B |
|---|---|---|---|---|---|---|
| General | MMLU (5-shot) | 66.6 | 45.7 | 53.8 | 79.5 | 69.7 |
| General | AGIEval English (3-5 shot) | 45.9 | 28.8 | 38.7 | 63.0 | 54.8 |
| General | CommonSenseQA (7-shot) | 72.6 | 57.6 | 67.6 | 83.8 | 78.7 |
| General | Winogrande (5-shot) | 76.1 | 73.3 | 75.4 | 83.1 | 81.8 |
| General | BIG-Bench Hard (3-shot, CoT) | 61.1 | 38.1 | 47.0 | 81.3 | 65.7 |
| General | ARC-Challenge (25-shot) | 78.6 | 53.7 | 67.6 | 93.0 | 85.3 |
| Knowledge reasoning | TriviaQA-Wiki (5-shot) | 78.5 | 72.1 | 79.6 | 89.7 | 87.5 |
| Reading comprehension | SQuAD (1-shot) | 76.4 | 72.2 | 72.1 | 85.6 | 82.6 |
| Reading comprehension | QuAC (1-shot, F1) | 44.4 | 39.6 | 44.9 | 51.1 | 49.4 |
| Reading comprehension | BoolQ (0-shot) | 75.7 | 65.5 | 66.9 | 79.0 | 73.1 |
| Reading comprehension | DROP (3-shot, F1) | 58.4 | 37.9 | 49.8 | 79.7 | 70.2 |
### Instruction tuned models
| Benchmark | Llama 3 8B | Llama 2 7B | Llama 2 13B | Llama 3 70B | Llama 2 70B |
|---|---|---|---|---|---|
| MMLU (5-shot) | 68.4 | 34.1 | 47.8 | 82.0 | 52.9 |
| GPQA (0-shot) | 34.2 | 21.7 | 22.3 | 39.5 | 21.0 |
| HumanEval (0-shot) | 62.2 | 7.9 | 14.0 | 81.7 | 25.6 |
| GSM-8K (8-shot, CoT) | 79.6 | 25.7 | 77.4 | 93.0 | 57.5 |
| MATH (4-shot, CoT) | 30.0 | 3.8 | 6.7 | 50.4 | 11.6 |
### Responsibility & Safety We believe that an open approach to AI leads to better, safer products, faster innovation, and a bigger overall market. We are committed to Responsible AI development and took a series of steps to limit misuse and harm and support the open source community. Foundation models are widely capable technologies that are built to be used for a diverse range of applications. They are not designed to meet every developer preference on safety levels for all use cases, out-of-the-box, as those by their nature will differ across different applications. Rather, responsible LLM-application deployment is achieved by implementing a series of safety best practices throughout the development of such applications, from the model pre-training, fine-tuning and the deployment of systems composed of safeguards to tailor the safety needs specifically to the use case and audience. As part of the Llama 3 release, we updated our Responsible Use Guide to outline the steps and best practices for developers to implement model and system level safety for their application. We also provide a set of resources including Meta Llama Guard 2 and Code Shield safeguards. These tools have proven to drastically reduce residual risks of LLM Systems, while maintaining a high level of helpfulness. We encourage developers to tune and deploy these safeguards according to their needs and we provide a reference implementation to get you started. #### Llama 3-Instruct As outlined in the Responsible Use Guide, some trade-off between model helpfulness and model alignment is likely unavoidable. Developers should exercise discretion about how to weigh the benefits of alignment and helpfulness for their specific use case and audience. Developers should be mindful of residual risks when using Llama models and leverage additional safety tools as needed to reach the right safety bar for their use case. Safety For our instruction tuned model, we conducted extensive red teaming exercises, performed adversarial evaluations and implemented safety mitigations techniques to lower residual risks. As with any Large Language Model, residual risks will likely remain and we recommend that developers assess these risks in the context of their use case. In parallel, we are working with the community to make AI safety benchmark standards transparent, rigorous and interpretable. Refusals In addition to residual risks, we put a great emphasis on model refusals to benign prompts. Over-refusing not only can impact the user experience but could even be harmful in certain contexts as well. We’ve heard the feedback from the developer community and improved our fine tuning to ensure that Llama 3 is significantly less likely to falsely refuse to answer prompts than Llama 2. We built internal benchmarks and developed mitigations to limit false refusals making Llama 3 our most helpful model to date. #### Responsible release In addition to responsible use considerations outlined above, we followed a rigorous process that requires us to take extra measures against misuse and critical risks before we make our release decision. Misuse If you access or use Llama 3, you agree to the Acceptable Use Policy. 
The most recent copy of this policy can be found at #### Critical risks CBRNE (Chemical, Biological, Radiological, Nuclear, and high yield Explosives) We have conducted a two fold assessment of the safety of the model in this area: * Iterative testing during model training to assess the safety of responses related to CBRNE threats and other adversarial risks. * Involving external CBRNE experts to conduct an uplift test assessing the ability of the model to accurately provide expert knowledge and reduce barriers to potential CBRNE misuse, by reference to what can be achieved using web search (without the model). ### Cyber Security We have evaluated Llama 3 with CyberSecEval, Meta’s cybersecurity safety eval suite, measuring Llama 3’s propensity to suggest insecure code when used as a coding assistant, and Llama 3’s propensity to comply with requests to help carry out cyber attacks, where attacks are defined by the industry standard MITRE ATT&CK cyber attack ontology. On our insecure coding and cyber attacker helpfulness tests, Llama 3 behaved in the same range or safer than models of equivalent coding capability. ### Child Safety Child Safety risk assessments were conducted using a team of experts, to assess the model’s capability to produce outputs that could result in Child Safety risks and inform on any necessary and appropriate risk mitigations via fine tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks through Llama 3 model development. For Llama 3, we conducted new in-depth sessions using objective based methodologies to assess the model risks along multiple attack vectors. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content while taking account of market specific nuances or experiences. ### Community Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership in AI and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open sourced for the community to use and widely distributed across ecosystem partners including cloud service providers. We encourage community contributions to our Github repository. Finally, we put in place a set of resources including an output reporting mechanism and bug bounty program to continuously improve the Llama technology with the help of the community. ## Ethical Considerations and Limitations The core values of Llama 3 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3 addresses users and their needs as they are, without insertion unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. But Llama 3 is a new technology, and like any new technology, there are risks associated with its use. 
Testing conducted to date has been in English, and has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 3 models, developers should perform safety testing and tuning tailored to their specific applications of the model. As outlined in the Responsible Use Guide, we recommend incorporating Purple Llama solutions into your workflows and specifically Llama Guard which provides a base model to filter input and output prompts to layer system-level safety on top of model-level safety. Please see the Responsible Use Guide available at ## Citation instructions @article{llama3modelcard, title={Llama 3 Model Card}, author={AI@Meta}, year={2024}, url = { } ## Contributors Aaditya Singh; Aaron Grattafiori; Abhimanyu Dubey; Abhinav Jauhri; Abhinav Pandey; Abhishek Kadian; Adam Kelsey; Adi Gangidi; Ahmad Al-Dahle; Ahuva Goldstand; Aiesha Letman; Ajay Menon; Akhil Mathur; Alan Schelten; Alex Vaughan; Amy Yang; Andrei Lupu; Andres Alvarado; Andrew Gallagher; Andrew Gu; Andrew Ho; Andrew Poulton; Andrew Ryan; Angela Fan; Ankit Ramchandani; Anthony Hartshorn; Archi Mitra; Archie Sravankumar; Artem Korenev; Arun Rao; Ashley Gabriel; Ashwin Bharambe; Assaf Eisenman; Aston Zhang; Aurelien Rodriguez; Austen Gregerson; Ava Spataru; Baptiste Roziere; Ben Maurer; Benjamin Leonhardi; Bernie Huang; Bhargavi Paranjape; Bing Liu; Binh Tang; Bobbie Chern; Brani Stojkovic; Brian Fuller; Catalina Mejia Arenas; Chao Zhou; Charlotte Caucheteux; Chaya Nayak; Ching-Hsiang Chu; Chloe Bi; Chris Cai; Chris Cox; Chris Marra; Chris McConnell; Christian Keller; Christoph Feichtenhofer; Christophe Touret; Chunyang Wu; Corinne Wong; Cristian Canton Ferrer; Damien Allonsius; Daniel Kreymer; Daniel Haziza; Daniel Li; Danielle Pintz; Danny Livshits; Danny Wyatt; David Adkins; David Esiobu; David Xu; Davide Testuggine; Delia David; Devi Parikh; Dhruv Choudhary; Dhruv Mahajan; Diana Liskovich; Diego Garcia-Olano; Diego Perino; Dieuwke Hupkes; Dingkang Wang; Dustin Holland; Egor Lakomkin; Elina Lobanova; Xiaoqing Ellen Tan; Emily Dinan; Eric Smith; Erik Brinkman; Esteban Arcaute; Filip Radenovic; Firat Ozgenel; Francesco Caggioni; Frank Seide; Frank Zhang; Gabriel Synnaeve; Gabriella Schwarz; Gabrielle Lee; Gada Badeer; Georgia Anderson; Graeme Nail; Gregoire Mialon; Guan Pang; Guillem Cucurell; Hailey Nguyen; Hannah Korevaar; Hannah Wang; Haroun Habeeb; Harrison Rudolph; Henry Aspegren; Hu Xu; Hugo Touvron; Iga Kozlowska; Igor Molybog; Igor Tufanov; Iliyan Zarov; Imanol Arrieta Ibarra; Irina-Elena Veliche; Isabel Kloumann; Ishan Misra; Ivan Evtimov; Jacob Xu; Jade Copet; Jake Weissman; Jan Geffert; Jana Vranes; Japhet Asher; Jason Park; Jay Mahadeokar; Jean-Baptiste Gaya; Jeet Shah; Jelmer van der Linde; Jennifer Chan; Jenny Hong; Jenya Lee; Jeremy Fu; Jeremy Teboul; Jianfeng Chi; Jianyu Huang; Jie Wang; Jiecao Yu; Joanna Bitton; Joe Spisak; Joelle Pineau; Jon Carvill; Jongsoo Park; Joseph Rocca; Joshua Johnstun; Junteng Jia; Kalyan Vasuden Alwala; Kam Hou U; Kate Plawiak; Kartikeya Upasani; Kaushik Veeraraghavan; Ke Li; Kenneth Heafield; Kevin Stone; Khalid El-Arini; Krithika Iyer; Kshitiz Malik; Kuenley Chiu; Kunal Bhalla; Kyle Huang; Lakshya Garg; Lauren Rantala-Yeary; Laurens van der Maaten; Lawrence Chen; Leandro Silva; Lee Bell; Lei Zhang; Liang Tan; Louis Martin; Lovish 
Madaan; Luca Wehrstedt; Lukas Blecher; Luke de Oliveira; Madeline Muzzi; Madian Khabsa; Manav Avlani; Mannat Singh; Manohar Paluri; Mark Zuckerberg; Marcin Kardas; Martynas Mankus; Mathew Oldham; Mathieu Rita; Matthew Lennie; Maya Pavlova; Meghan Keneally; Melanie Kambadur; Mihir Patel; Mikayel Samvelyan; Mike Clark; Mike Lewis; Min Si; Mitesh Kumar Singh; Mo Metanat; Mona Hassan; Naman Goyal; Narjes Torabi; Nicolas Usunier; Nikolay Bashlykov; Nikolay Bogoychev; Niladri Chatterji; Ning Dong; Oliver Aobo Yang; Olivier Duchenne; Onur Celebi; Parth Parekh; Patrick Alrassy; Paul Saab; Pavan Balaji; Pedro Rittner; Pengchuan Zhang; Pengwei Li; Petar Vasic; Peter Weng; Polina Zvyagina; Prajjwal Bhargava; Pratik Dubal; Praveen Krishnan; Punit Singh Koura; Qing He; Rachel Rodriguez; Ragavan Srinivasan; Rahul Mitra; Ramon Calderer; Raymond Li; Robert Stojnic; Roberta Raileanu; Robin Battey; Rocky Wang; Rohit Girdhar; Rohit Patel; Romain Sauvestre; Ronnie Polidoro; Roshan Sumbaly; Ross Taylor; Ruan Silva; Rui Hou; Rui Wang; Russ Howes; Ruty Rinott; Saghar Hosseini; Sai Jayesh Bondu; Samyak Datta; Sanjay Singh; Sara Chugh; Sargun Dhillon; Satadru Pan; Sean Bell; Sergey Edunov; Shaoliang Nie; Sharan Narang; Sharath Raparthy; Shaun Lindsay; Sheng Feng; Sheng Shen; Shenghao Lin; Shiva Shankar; Shruti Bhosale; Shun Zhang; Simon Vandenhende; Sinong Wang; Seohyun Sonia Kim; Soumya Batra; Sten Sootla; Steve Kehoe; Suchin Gururangan; Sumit Gupta; Sunny Virk; Sydney Borodinsky; Tamar Glaser; Tamar Herman; Tamara Best; Tara Fowler; Thomas Georgiou; Thomas Scialom; Tianhe Li; Todor Mihaylov; Tong Xiao; Ujjwal Karn; Vedanuj Goswami; Vibhor Gupta; Vignesh Ramanathan; Viktor Kerkez; Vinay Satish Kumar; Vincent Gonguet; Vish Vogeti; Vlad Poenaru; Vlad Tiberiu Mihailescu; Vladan Petrovic; Vladimir Ivanov; Wei Li; Weiwei Chu; Wenhan Xiong; Wenyin Fu; Wes Bouaziz; Whitney Meers; Will Constable; Xavier Martinet; Xiaojian Wu; Xinbo Gao; Xinfeng Xie; Xuchao Jia; Yaelle Goldschlag; Yann LeCun; Yashesh Gaur; Yasmine Babaei; Ye Qi; Yenda Li; Yi Wen; Yiwen Song; Youngjin Nam; Yuchen Hao; Yuchen Zhang; Yun Wang; Yuning Mao; Yuzi He; Zacharie Delpierre Coudert; Zachary DeVito; Zahra Hankir; Zhaoduo Wen; Zheng Yan; Zhengxing Chen; Zhenyu Yang; Zoe Papakipos",
+ "model_explanation_gemini": "\"Optimizes Llama-3-8B for faster fine-tuning (2-5x speed) and reduced memory usage (70% less) via Unsloth, supporting conversational and text-completion tasks.\"\n\n**Features**: \n- 4-bit quantization (bnb-4bit) \n- 2-5x faster fine-tuning than standard methods \n- 70% lower memory consumption \n- Supports GGUF/vLLM export and HF uploads \n- Beginner-friendly Colab notebooks"
+}
\ No newline at end of file
diff --git a/model_data_json/unslothai_4.json b/model_data_json/unslothai_4.json
new file mode 100644
index 0000000000000000000000000000000000000000..d9b9a47419392e681edba3042da400f30271fcee
--- /dev/null
+++ b/model_data_json/unslothai_4.json
@@ -0,0 +1,15 @@
+{
+ "model_id": "unslothai/4",
+ "downloads": 71368,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "llama",
+ "feature-extraction",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- library_name: transformers tags: [] ---",
+ "model_explanation_gemini": "The provided description for the model 'unslothai_4' does not contain any information about its functionality, purpose, features, or comparisons with other models. \n\nFeatures: None specified. \nComparative Explanation: Not applicable."
+}
\ No newline at end of file
diff --git a/model_data_json/unslothai_azure.json b/model_data_json/unslothai_azure.json
new file mode 100644
index 0000000000000000000000000000000000000000..763de657b20f362a25f8e07c59b580aed353f301
--- /dev/null
+++ b/model_data_json/unslothai_azure.json
@@ -0,0 +1,15 @@
+{
+ "model_id": "unslothai/azure",
+ "downloads": 79899,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "llama",
+ "feature-extraction",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- {} --- We log statistics to see if any envs are breaking",
+ "model_explanation_gemini": "Logs statistics to monitor environment stability.\n\nFeatures: \n- Tracks and logs environment statistics. \n- Detects potential environment issues or breaks. \n\nComparison: Not applicable (no comparative details provided in the description)."
+}
\ No newline at end of file
diff --git a/model_data_json/unslothai_vram-96.json b/model_data_json/unslothai_vram-96.json
new file mode 100644
index 0000000000000000000000000000000000000000..c1ee11b94edaaabf7c61e6b477b285c5f4792227
--- /dev/null
+++ b/model_data_json/unslothai_vram-96.json
@@ -0,0 +1,16 @@
+{
+ "model_id": "unslothai/vram-96",
+ "downloads": 71977,
+ "tags": [
+ "transformers",
+ "safetensors",
+ "llama",
+ "feature-extraction",
+ "arxiv:1910.09700",
+ "text-generation-inference",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "--- library_name: transformers tags: [] --- # Model Card for Model ID ## Model Details ### Model Description This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated. - **Developed by:** [More Information Needed] - **Funded by [optional]:** [More Information Needed] - **Shared by [optional]:** [More Information Needed] - **Model type:** [More Information Needed] - **Language(s) (NLP):** [More Information Needed] - **License:** [More Information Needed] - **Finetuned from model [optional]:** [More Information Needed] ### Model Sources [optional] - **Repository:** [More Information Needed] - **Paper [optional]:** [More Information Needed] - **Demo [optional]:** [More Information Needed] ## Uses ### Direct Use [More Information Needed] ### Downstream Use [optional] [More Information Needed] ### Out-of-Scope Use [More Information Needed] ## Bias, Risks, and Limitations [More Information Needed] ### Recommendations Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. ## How to Get Started with the Model Use the code below to get started with the model. [More Information Needed] ## Training Details ### Training Data [More Information Needed] ### Training Procedure #### Preprocessing [optional] [More Information Needed] #### Training Hyperparameters - **Training regime:** [More Information Needed] #### Speeds, Sizes, Times [optional] [More Information Needed] ## Evaluation ### Testing Data, Factors & Metrics #### Testing Data [More Information Needed] #### Factors [More Information Needed] #### Metrics [More Information Needed] ### Results [More Information Needed] #### Summary ## Model Examination [optional] [More Information Needed] ## Environmental Impact Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). - **Hardware Type:** [More Information Needed] - **Hours used:** [More Information Needed] - **Cloud Provider:** [More Information Needed] - **Compute Region:** [More Information Needed] - **Carbon Emitted:** [More Information Needed] ## Technical Specifications [optional] ### Model Architecture and Objective [More Information Needed] ### Compute Infrastructure [More Information Needed] #### Hardware [More Information Needed] #### Software [More Information Needed] ## Citation [optional] **BibTeX:** [More Information Needed] **APA:** [More Information Needed] ## Glossary [optional] [More Information Needed] ## More Information [optional] [More Information Needed] ## Model Card Authors [optional] [More Information Needed] ## Model Card Contact [More Information Needed]",
+ "model_explanation_gemini": "The model description for 'unslothai_vram-96' provides insufficient information to determine its core functionality, purpose, or features, as key details like model type, task, and training data are marked \"[More Information Needed].\"\n\nFeatures: \n- No discernible features are listed due to incomplete documentation. \n\nComparison: \n- No comparative analysis is possible as the description lacks details about the model's performance, architecture, or benchmarks against other models."
+}
\ No newline at end of file
diff --git a/model_data_json/vblagoje_bert-english-uncased-finetuned-pos.json b/model_data_json/vblagoje_bert-english-uncased-finetuned-pos.json
new file mode 100644
index 0000000000000000000000000000000000000000..c8e9b0f41cc56e8c905b65301a8862dda4adc4aa
--- /dev/null
+++ b/model_data_json/vblagoje_bert-english-uncased-finetuned-pos.json
@@ -0,0 +1,17 @@
+{
+ "model_id": "vblagoje/bert-english-uncased-finetuned-pos",
+ "downloads": 77652,
+ "tags": [
+ "transformers",
+ "pytorch",
+ "jax",
+ "safetensors",
+ "bert",
+ "token-classification",
+ "autotrain_compatible",
+ "endpoints_compatible",
+ "region:us"
+ ],
+ "description": "## Part-of-Speech (PoS) Tags Below are the Part-of-Speech (PoS) tags used in the model: | **Tag** | **Meaning** | **Examples** | |-----------|------------------------------------------------------|--------------------------------| | ADP | Adposition (prepositions or postpositions) | in, on, by | | ADJ | Adjective | significant, global | | ADV | Adverb | quickly, often | | AUX | Auxiliary verb | is, was | | CCONJ | Coordinating conjunction | and, but | | DET | Determiner | the, a | | INTJ | Interjection | oh, wow | | NOUN | Noun | man, city | | NUM | Number | one, 2022 | | PART | Particle | 's, to | | PRON | Pronoun | he, which | | PROPN | Proper noun | Neil Armstrong, Paris | | PUNCT | Punctuation mark | ,, . | | SCONJ | Subordinating conjunction | because, although | | SYM | Symbol | $, % | | VERB | Verb | run, is | | X | Other (generally words that do not fit into other categories) | [not defined] |",
+ "model_explanation_gemini": "Finetuned for part-of-speech tagging, this BERT-based model identifies and classifies words into grammatical categories like nouns, verbs, adjectives, and more. \n\n**Model Features:** \n- Uncased English BERT finetuned for PoS tagging \n- Supports standard PoS tags (e.g., NOUN, VERB, ADJ, ADV, PUNCT) \n- Handles diverse word categories, including symbols (SYM) and interjections (INTJ)"
+}
\ No newline at end of file
diff --git a/model_data_json/vidore_colpali-v1.3.json b/model_data_json/vidore_colpali-v1.3.json
new file mode 100644
index 0000000000000000000000000000000000000000..f1271e21b6ad35176d4756f79076cc8c83249586
--- /dev/null
+++ b/model_data_json/vidore_colpali-v1.3.json
@@ -0,0 +1,22 @@
+{
+ "model_id": "vidore/colpali-v1.3",
+ "downloads": 81111,
+ "tags": [
+ "colpali",
+ "safetensors",
+ "vidore",
+ "vidore-experimental",
+ "visual-document-retrieval",
+ "en",
+ "dataset:vidore/colpali_train_set",
+ "arxiv:2004.12832",
+ "arxiv:2407.01449",
+ "arxiv:2106.09685",
+ "base_model:vidore/colpaligemma-3b-pt-448-base",
+ "base_model:finetune:vidore/colpaligemma-3b-pt-448-base",
+ "license:mit",
+ "region:us"
+ ],
+ "description": "--- license: mit library_name: colpali base_model: vidore/colpaligemma-3b-pt-448-base language: - en tags: - vidore - vidore-experimental datasets: - vidore/colpali_train_set pipeline_tag: visual-document-retrieval --- # ColPali: Visual Retriever based on PaliGemma-3B with ColBERT strategy ## This version is trained with 256 batch size for 3 epochs on the same data as the original ColPali model. ColPali is a model based on a novel model architecture and training strategy based on Vision Language Models (VLMs) to efficiently index documents from their visual features. It is a PaliGemma-3B extension that generates ColBERT- style multi-vector representations of text and images. It was introduced in the paper ColPali: Efficient Document Retrieval with Vision Language Models and first released in this repository