Spaces: aidevhund/chatbotQA


aidevhund committed on Jan 11

Commit 6850f57 (verified) · 1 parent: 7c3e962

Update markdowm.py


Files changed (1)

markdowm.py +21 -21


@@ -17,7 +17,7 @@ With QueryVault Chatbot, you can interactively query your document, receive cont
 ## 🚀 **Steps to Use the HundAI QueryVault Chatbot**
 1. **Upload Your File**
-   Begin by uploading a document. Supported formats include `.pdf`, `.docx`, `.txt`, `.csv`, `.xlsx`, `.pptx`, `.html`, `.jpg`, `.png`, and more.
 2. **Select Embedding Model**
    Choose an embedding model to parse and index the document’s contents, then submit. Wait for the confirmation message that the document has been successfully indexed.
@@ -39,30 +39,30 @@ Upon uploading a document, the bot utilizes **LlamaParse** to parse its content.
 ## 🔍 **Available LLMs and Embedding Models**
 ### **Embedding Models** (For indexing document content)
-1. **`BAAI/bge-large-en`**
    - **Size**: 335M parameters
    - **Best For**: Complex, detailed embeddings; slower but yields high accuracy.
-2. **`BAAI/bge-small-en-v1.5`**
    - **Size**: 33.4M parameters
    - **Best For**: Faster embeddings, ideal for lighter workloads and quick responses.
-3. **`NeuML/pubmedbert-base-embeddings`**
    - **Size**: 768-dimensional dense vector space
    - **Best For**: Biomedical or medical-related text; highly specialized.
-4. **`BAAI/llm-embedder`**
    - **Size**: 109M parameters
    - **Best For**: Basic embeddings for straightforward use cases.
 ### **LLMs** (For generating answers)
-1. **`Mixtral-8x7B-Instruct`**
    - **Size**: 46.7B parameters
    - **Purpose**: Demonstrates compelling performance with minimal fine-tuning. Suited for unmoderated or exploratory use.
-2. **`Meta-Llama-3-8B-Instruct`**
    - **Size**: 8.03B parameters
    - **Purpose**: Optimized for dialogue, emphasizing safety and helpfulness. Excellent for structured, instructive responses.
-3. **`Mistral-7B`**
    - **Size**: 7.24B parameters
    - **Purpose**: Fine-tuned for effectiveness; lacks moderation, useful for quick demonstration purposes.
-4. **`HundAI`**
    - **Size**: 7.22B parameters
    - **Purpose**: Robust fine-tuned model for inference, leveraging large-scale data for highly contextual responses.
@@ -74,18 +74,18 @@ The choice of embedding models plays a crucial role in determining the speed and
 | **Scenario**                  | **Embedding Model**                  | **Strengths**                                      | **Trade-Offs**                       |
 |:-----------------------------:|:------------------------------------:|:--------------------------------------------------:|:------------------------------------:|
-| **Fastest Response**          | `BAAI/bge-small-en-v1.5`            | Speed-oriented, ideal for high-frequency querying  | May miss nuanced details             |
-| **High Accuracy for Large Texts** | `BAAI/bge-large-en`               | High accuracy, captures complex document structure | Slower response time                |
-| **Balanced General Purpose**  | `BAAI/llm-embedder` | Reliable, quick response, adaptable across topics | Moderate accuracy, general use case  |
-| **Biomedical & Specialized Text** | `NeuML/pubmedbert-base-embeddings` | Optimized for medical and scientific text          | Specialized, slightly slower         |
 ---
 ## 📂 **Supported File Formats**
 The bot supports a range of document formats, making it versatile for various data sources. Below are the currently supported formats:
-- **Documents**: `.pdf`, `.docx`, `.doc`, `.txt`, `.csv`, `.xlsx`, `.pptx`, `.html`
-- **Images**: `.jpg`, `.jpeg`, `.png`, `.webp`, `.svg`
 ---
@@ -114,17 +114,17 @@ guide = '''
 | **Embedding Model**         | **Speed (Vector Index)** | **Advantages**                      | **Trade-Offs**                  |
 |-----------------------------|-------------------|-------------------------------------|---------------------------------|
-| `BAAI/bge-small-en-v1.5`    | **Fastest**       | Ideal for quick indexing            | May miss nuanced details        |
-| `BAAI/llm-embedder`         | **Fast**          | Balanced performance and detail     | Slightly less precise than large models |
-| `BAAI/bge-large-en`         | **Slow**          | Best overall precision and detail   | Slower due to complexity        |
 ### Language Models (LLMs) and Use Cases
 | **LLM**                             | **Best Use Case**                       |
 |------------------------------------|-----------------------------------------|
-| `Mixtral-8x7B-Instruct-v0.1` | Works well for **both short and long answers** |
-| `Meta-Llama-3-8B-Instruct`  | Ideal for **long-length answers**         |
-| `HundAI`           | Best suited for **short-length answers**  |
 '''

 ## 🚀 **Steps to Use the HundAI QueryVault Chatbot**
 1. **Upload Your File**
+   Begin by uploading a document. Supported formats include .pdf, .docx, .txt, .csv, .xlsx, .pptx, .html, .jpg, .png, and more.
 2. **Select Embedding Model**
    Choose an embedding model to parse and index the document’s contents, then submit. Wait for the confirmation message that the document has been successfully indexed.
 ## 🔍 **Available LLMs and Embedding Models**
 ### **Embedding Models** (For indexing document content)
+1. **BAAI/bge-large-en**
    - **Size**: 335M parameters
    - **Best For**: Complex, detailed embeddings; slower but yields high accuracy.
+2. **BAAI/bge-small-en-v1.5**
    - **Size**: 33.4M parameters
    - **Best For**: Faster embeddings, ideal for lighter workloads and quick responses.
+3. **NeuML/pubmedbert-base-embeddings**
    - **Size**: 768-dimensional dense vector space
    - **Best For**: Biomedical or medical-related text; highly specialized.
+4. **BAAI/llm-embedder**
    - **Size**: 109M parameters
    - **Best For**: Basic embeddings for straightforward use cases.
 ### **LLMs** (For generating answers)
+1. **Mixtral-8x7B-Instruct**
    - **Size**: 46.7B parameters
    - **Purpose**: Demonstrates compelling performance with minimal fine-tuning. Suited for unmoderated or exploratory use.
+2. **Meta-Llama-3-8B-Instruct**
    - **Size**: 8.03B parameters
    - **Purpose**: Optimized for dialogue, emphasizing safety and helpfulness. Excellent for structured, instructive responses.
+3. **Mistral-7B**
    - **Size**: 7.24B parameters
    - **Purpose**: Fine-tuned for effectiveness; lacks moderation, useful for quick demonstration purposes.
+4. **HundAI-7B-S**
    - **Size**: 7.22B parameters
    - **Purpose**: Robust fine-tuned model for inference, leveraging large-scale data for highly contextual responses.
 | **Scenario**                  | **Embedding Model**                  | **Strengths**                                      | **Trade-Offs**                       |
 |:-----------------------------:|:------------------------------------:|:--------------------------------------------------:|:------------------------------------:|
+| **Fastest Response**          | BAAI/bge-small-en-v1.5            | Speed-oriented, ideal for high-frequency querying  | May miss nuanced details             |
+| **High Accuracy for Large Texts** | BAAI/bge-large-en               | High accuracy, captures complex document structure | Slower response time                |
+| **Balanced General Purpose**  | BAAI/llm-embedder | Reliable, quick response, adaptable across topics | Moderate accuracy, general use case  |
+| **Biomedical & Specialized Text** | NeuML/pubmedbert-base-embeddings | Optimized for medical and scientific text          | Specialized, slightly slower         |
 ---
 ## 📂 **Supported File Formats**
 The bot supports a range of document formats, making it versatile for various data sources. Below are the currently supported formats:
+- **Documents**: .pdf, .docx, .doc, .txt, .csv, .xlsx, .pptx, .html
+- **Images**: .jpg, .jpeg, .png, .webp, .svg
 ---
 | **Embedding Model**         | **Speed (Vector Index)** | **Advantages**                      | **Trade-Offs**                  |
 |-----------------------------|-------------------|-------------------------------------|---------------------------------|
+| BAAI/bge-small-en-v1.5    | **Fastest**       | Ideal for quick indexing            | May miss nuanced details        |
+| BAAI/llm-embedder         | **Fast**          | Balanced performance and detail     | Slightly less precise than large models |
+| BAAI/bge-large-en         | **Slow**          | Best overall precision and detail   | Slower due to complexity        |
 ### Language Models (LLMs) and Use Cases
 | **LLM**                             | **Best Use Case**                       |
 |------------------------------------|-----------------------------------------|
+| Mixtral-8x7B-Instruct-v0.1 | Works well for **both short and long answers** |
+| Meta-Llama-3-8B-Instruct  | Ideal for **long-length answers**         |
+| HundAI-7B-S           | Best suited for **short-length answers**  |
 '''
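The scenario table in the guide text above can be read as a plain lookup from use case to embedding model. The following is a minimal sketch of that reading, not code from `markdowm.py`: the names `EMBEDDING_BY_SCENARIO` and `pick_embedding_model` are hypothetical, and only the scenario labels and model identifiers are taken from the table.

```python
# Hypothetical helper; not part of markdowm.py.
# Scenario labels and model identifiers are copied from the guide's scenario table.
EMBEDDING_BY_SCENARIO = {
    "fastest response": "BAAI/bge-small-en-v1.5",
    "high accuracy for large texts": "BAAI/bge-large-en",
    "balanced general purpose": "BAAI/llm-embedder",
    "biomedical & specialized text": "NeuML/pubmedbert-base-embeddings",
}


def pick_embedding_model(scenario: str) -> str:
    """Return the embedding model the guide lists for a scenario label."""
    # Normalize case and surrounding whitespace so table labels match as written.
    key = scenario.strip().lower()
    try:
        return EMBEDDING_BY_SCENARIO[key]
    except KeyError:
        known = ", ".join(sorted(EMBEDDING_BY_SCENARIO))
        raise ValueError(
            f"unknown scenario {scenario!r}; expected one of: {known}"
        ) from None


if __name__ == "__main__":
    print(pick_embedding_model("Fastest Response"))
```

Keeping the mapping in one dictionary would let the Markdown table and any selection logic be generated from the same data, though the Space may well handle this differently.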