---
title: RagBenchCapstone10
emoji: π
colorFrom: green
colorTo: yellow
sdk: gradio
sdk_version: 5.16.0
app_file: app.py
pinned: false
short_description: RagBench Dataset development by Saiteja
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# RAG Benchmark Evaluation System

## Overview

This project implements a Retrieval-Augmented Generation (RAG) system for evaluating different language models and reranking strategies. It provides a user-friendly interface for querying documents and analyzing the performance of various models.
## Features

- Multiple LLM support (LLaMA 3.3, Mistral 7B)
- Various reranking models:
  - MS MARCO MiniLM
  - MS MARCO TinyBERT
  - MonoT5 Base
  - MonoT5 Small
  - MonoT5 3B
- Vector similarity search using Milvus (see the sketch after this list)
- Automatic document chunking and retrieval
- Performance metrics calculation
- Interactive Gradio interface
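A minimal sketch of the retrieval step, assuming a local Milvus Lite database, a collection named `rag_chunks`, and a small `sentence-transformers` embedding model; all of these names are illustrative, not the app's actual configuration:

```python
# Sketch of chunk retrieval via Milvus (collection and model names are assumed).
from pymilvus import MilvusClient
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
client = MilvusClient("milvus_demo.db")             # Milvus Lite local file

def retrieve(query: str, top_k: int = 5) -> list[str]:
    """Embed the query and return the top_k most similar chunks."""
    query_vec = embedder.encode(query).tolist()
    hits = client.search(
        collection_name="rag_chunks",  # hypothetical collection name
        data=[query_vec],
        limit=top_k,
        output_fields=["text"],
    )
    return [hit["entity"]["text"] for hit in hits[0]]
```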
## Prerequisites

- Python 3.8+
- CUDA-compatible GPU (optional, for faster processing)
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/rag-benchmark.git
   cd rag-benchmark
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Configure the models:
   - Create a `models` directory and add your language model files.
   - Create a `rerankers` directory and add your reranking model files.

4. Run the application:

   ```bash
   python app.py
   ```
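Since `groq` appears in the dependency list and the Groq API is acknowledged below, the app likely needs a Groq API key at runtime. The Groq client reads it from the `GROQ_API_KEY` environment variable by default; a minimal sketch of a chat call, with the model id only as an example:

```python
# Sketch of an LLM call through the Groq API (model id is an example).
import os
from groq import Groq

os.environ.setdefault("GROQ_API_KEY", "your-key-here")  # set your real key

client = Groq()  # picks up GROQ_API_KEY automatically
response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # example Groq model id
    messages=[{"role": "user", "content": "Summarize the retrieved context."}],
)
print(response.choices[0].message.content)
```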
## Usage

1. Start the application with `python app.py`.
2. Access the web interface at `http://localhost:7860`.
3. Enter your question and select:
   - LLM Model (LLaMA 3.3 or Mistral 7B)
   - Reranking Model (MS MARCO or MonoT5 variants)
4. Click "Evaluate Model" to get results.
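A rough sketch of how this flow could be wired in Gradio; the `evaluate_model` function and the dropdown choices are assumptions based on the steps above, not the app's actual identifiers:

```python
# Sketch of the Gradio wiring (evaluate_model and choice labels are assumed).
import gradio as gr

def evaluate_model(question, llm_name, reranker_name):
    # ... retrieval, reranking, generation, and metrics would run here ...
    return {"llm": llm_name, "reranker": reranker_name, "answer": "..."}

demo = gr.Interface(
    fn=evaluate_model,
    inputs=[
        gr.Textbox(label="Question"),
        gr.Dropdown(["LLaMA 3.3", "Mistral 7B"], label="LLM Model"),
        gr.Dropdown(
            ["MS MARCO MiniLM", "MS MARCO TinyBERT",
             "MonoT5 Small", "MonoT5 Base", "MonoT5 3B"],
            label="Reranking Model",
        ),
    ],
    outputs=gr.JSON(label="Results"),
)
demo.launch(server_port=7860)
```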
## Metrics

The system calculates several performance metrics:

- RMSE for Context Relevance
- RMSE for Context Utilization
- AUROC for Adherence
- Processing Time
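For reference, RMSE and AUROC can be computed with scikit-learn as below; this is a generic sketch with made-up values, not the app's exact metric code:

```python
# Generic RMSE / AUROC computation with scikit-learn (illustrative values).
import numpy as np
from sklearn.metrics import mean_squared_error, roc_auc_score

# Ground-truth vs. predicted relevance scores (floats in [0, 1]).
true_relevance = np.array([0.9, 0.2, 0.6, 0.4])
pred_relevance = np.array([0.8, 0.3, 0.5, 0.5])
rmse = np.sqrt(mean_squared_error(true_relevance, pred_relevance))

# Adherence as a binary label vs. a predicted probability.
true_adherence = np.array([1, 0, 1, 1])
pred_adherence = np.array([0.7, 0.4, 0.9, 0.6])
auroc = roc_auc_score(true_adherence, pred_adherence)

print(f"RMSE: {rmse:.3f}, AUROC: {auroc:.3f}")
```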
## Reranking Models Comparison

### MS MARCO Models

- **MiniLM**: Fast and efficient, with good general performance
- **TinyBERT**: Lightweight; slightly lower accuracy but faster

### MonoT5 Models

- **Small**: Compact and fast, suitable for limited resources
- **Base**: Balanced performance and speed
- **3B**: Highest accuracy, but requires more computational resources
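As an illustration, an MS MARCO cross-encoder can rerank retrieved chunks in a few lines via `sentence-transformers`; the model id below is the public MiniLM checkpoint, which may differ from the one this project bundles:

```python
# Rerank retrieved chunks with an MS MARCO cross-encoder (public checkpoint).
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "What does RAGBench measure?"
chunks = [
    "RAGBench evaluates retrieval-augmented generation systems.",
    "Milvus is a vector database.",
    "MonoT5 is a sequence-to-sequence reranker.",
]

# Score each (query, chunk) pair, then sort chunks best-first.
scores = reranker.predict([(query, chunk) for chunk in chunks])
ranked = sorted(zip(chunks, scores), key=lambda pair: pair[1], reverse=True)
for chunk, score in ranked:
    print(f"{score:7.3f}  {chunk}")
```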
## Error Handling

- Automatic fallback to fewer documents if token limits are exceeded (sketched below)
- Graceful handling of API timeouts
- Comprehensive error logging
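One way such a fallback can work, sketched with `tiktoken`; the token budget and encoding here are assumptions, not the app's actual settings:

```python
# Sketch of the document-count fallback when a prompt exceeds a token budget.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # assumed encoding
MAX_PROMPT_TOKENS = 4096                    # assumed budget

def fit_context(question: str, chunks: list[str]) -> list[str]:
    """Drop the lowest-ranked chunks until the prompt fits the budget."""
    kept = list(chunks)
    while kept:
        prompt = question + "\n\n" + "\n\n".join(kept)
        if len(enc.encode(prompt)) <= MAX_PROMPT_TOKENS:
            return kept
        kept.pop()  # chunks are assumed ranked best-first; drop the worst
    return kept
```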
## Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## Dependencies

- gradio
- torch
- transformers
- sentence-transformers
- pymilvus
- numpy
- pandas
- scikit-learn
- tiktoken
- groq
- huggingface_hub
## License

[Your License Here]

## Acknowledgments

- RAGBench dataset
- Hugging Face Transformers
- Milvus Vector Database
- Groq API