Spaces:

ssaiteja16
/

RagBenchCapstone10

Sleeping

File size: 1,913 Bytes

fed116a
923b896
 
fed116a
923b896
 
 
 
 
fed116a
923b896
 
 
 
 
fed116a
923b896
 
 
 
 
fed116a
923b896
 
 
 
 
fed116a
 
 
923b896
fed116a
 
923b896
fed116a
 
 
 
923b896
 
fed116a
923b896
 
 
 
 
 
fed116a
923b896
 
 
 
 
 
 
 
 
 
 
fed116a
 
923b896
fed116a
923b896
 
 
 
 
fed116a

# Performance Analysis Report

1. **Retrieval Time**:

   - Milvus + LLaMA: 0.132s
   - Weaviate + Mistral: 0.157s
   - Milvus + Mistral: NaN

2. **Context Relevance** (higher is better):

   - Milvus + LLaMA: 0.640
   - Weaviate + Mistral: 0.591
   - Milvus + Mistral: 0.518

3. **Context Utilization** (higher is better):

   - Milvus + LLaMA: 0.673
   - Weaviate + Mistral: 0.619
   - Milvus + Mistral: 0.614

4. **AUCROC** (Area Under ROC Curve):

   - Milvus + LLaMA: 0.912
   - Weaviate + Mistral: 0.750
   - Milvus + Mistral: 0.844

5. **RMSE** (Root Mean Square Error):
   - Milvus + LLaMA:
     - Context Relevance RMSE: 0.179
     - Context Utilization RMSE: 0.302
   - Weaviate + Mistral:
     - Context Relevance RMSE: 0.414
     - Context Utilization RMSE: 0.482
   - Milvus + Mistral:
     - Context Relevance RMSE: 0.167
     - Context Utilization RMSE: 0.258

## Analysis

1. **Best Overall Performance: Milvus + LLaMA**

   - Highest AUCROC score (0.912)
   - Best context relevance (0.640) and utilization (0.673)
   - Fast retrieval time (0.132s)
   - Moderate RMSE scores

2. **Runner-up: Milvus + Mistral**

   - Second-best AUCROC (0.844)
   - Lowest RMSE scores overall
   - Lower context relevance and utilization
   - Retrieval time data unavailable

3. **Third Place: Weaviate + Mistral**
   - Lowest AUCROC (0.750)
   - Highest RMSE scores
   - Slowest retrieval time (0.157s)
   - Moderate context metrics

## Recommendation

Based on the comprehensive analysis of all metrics, Milvus + LLaMA emerges as the optimal choice for overall performance. It demonstrates:

- Superior accuracy (highest AUCROC)
- Better context handling capabilities
- Efficient retrieval speed
- Reasonable error rates

However, if minimizing error (RMSE) is the primary objective, Milvus + Mistral could be a viable alternative due to its lower error rates in both context relevance and utilization metrics.