# Norwegian RAG Chatbot Project Structure

## Overview

This document outlines the project structure for our lightweight Norwegian RAG chatbot, which uses Hugging Face's Inference API instead of running models locally.

## Directory Structure

```
chatbot_project/
├── design/                          # Design documents
│   ├── rag_architecture.md
│   ├── document_processing.md
│   └── chat_interface.md
├── research/                        # Research findings
│   └── norwegian_llm_research.md
├── src/                             # Source code
│   ├── api/                         # API integration
│   │   ├── __init__.py
│   │   ├── huggingface_api.py       # HF Inference API integration
│   │   └── config.py                # API configuration
│   ├── document_processing/         # Document processing
│   │   ├── __init__.py
│   │   ├── extractor.py             # Text extraction from documents
│   │   ├── chunker.py               # Text chunking
│   │   └── processor.py             # Main document processor
│   ├── rag/                         # RAG implementation
│   │   ├── __init__.py
│   │   ├── retriever.py             # Document retrieval
│   │   └── generator.py             # Response generation
│   ├── web/                         # Web interface
│   │   ├── __init__.py
│   │   ├── app.py                   # Gradio app
│   │   └── embed.py                 # Embed-code generation
│   ├── utils/                       # Utilities
│   │   ├── __init__.py
│   │   └── helpers.py               # Helper functions
│   └── main.py                      # Main application entry point
├── data/                            # Data storage
│   ├── documents/                   # Original documents
│   └── processed/                   # Processed documents and embeddings
├── tests/                           # Tests
│   ├── test_api.py
│   ├── test_document_processing.py
│   └── test_rag.py
├── venv/                            # Virtual environment
├── requirements-ultra-light.txt     # Lightweight dependencies
├── requirements.txt                 # Original requirements (for reference)
└── README.md                        # Project documentation
```

## Key Components

### 1. API Integration (`src/api/`)

- `huggingface_api.py`: Integration with the Hugging Face Inference API for both the LLM and the embedding model
- `config.py`: Configuration for API endpoints, model IDs, and API keys
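
A minimal sketch of what `huggingface_api.py` might contain, using only the standard library. The endpoint URL follows the public Inference API convention (`https://api-inference.huggingface.co/models/<model_id>`); the model ID and key shown in the usage example are placeholders, not the project's actual configuration.

```python
import json
import urllib.request

HF_API_URL = "https://api-inference.huggingface.co/models"


def build_request(model_id: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build an authenticated POST request for the HF Inference API."""
    return urllib.request.Request(
        f"{HF_API_URL}/{model_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # HF token goes here
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query(model_id: str, api_key: str, payload: dict):
    """Send the request and decode the JSON response."""
    req = build_request(model_id, api_key, payload)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

In practice the real module would also handle rate limits, retries, and the "model is loading" responses the Inference API can return.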

### 2. Document Processing (`src/document_processing/`)

- `extractor.py`: Extract text from various document formats
- `chunker.py`: Split documents into manageable chunks
- `processor.py`: Orchestrate the document processing pipeline
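
The chunking step could look like the sketch below: a fixed-size character window with a small overlap so that sentences cut at a boundary still appear whole in one of the neighbouring chunks. The function name and default sizes are illustrative, not the project's settings.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    Each chunk is at most `chunk_size` characters, and consecutive
    chunks share `overlap` characters of context.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping the overlap
    return chunks
```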

### 3. RAG Implementation (`src/rag/`)

- `retriever.py`: Retrieve the document chunks most relevant to a query
- `generator.py`: Generate responses using the retrieved context
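
Since embeddings live in plain files rather than a vector database, retrieval can be a brute-force cosine-similarity scan. A sketch of the core of `retriever.py` (function names assumed):

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_emb, chunk_embs, chunks, top_k=3):
    """Return the top_k (chunk, score) pairs ranked by similarity."""
    scored = sorted(
        zip(chunks, (cosine_similarity(query_emb, e) for e in chunk_embs)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return scored[:top_k]
```

For a few thousand chunks this linear scan is typically fast enough; a dedicated index only becomes worthwhile at much larger scale.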

### 4. Web Interface (`src/web/`)

- `app.py`: Gradio web interface for the chatbot
- `embed.py`: Generate embed code for website integration
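
`embed.py` could boil down to producing an iframe snippet that points at the deployed app. The function name, default dimensions, and URL below are hypothetical:

```python
def build_embed_snippet(space_url: str, width: str = "100%", height: str = "600px") -> str:
    """Return an HTML iframe snippet embedding the deployed chat app."""
    return (
        f'<iframe src="{space_url}" width="{width}" height="{height}" '
        f'frameborder="0" title="Norwegian RAG chatbot"></iframe>'
    )
```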

### 5. Main Application (`src/main.py`)

- Entry point for the application
- Orchestrates the different components

## Implementation Approach

1. **Remote Model Execution**: Use Hugging Face's Inference API for both the LLM and the embedding model
2. **Lightweight Document Processing**: Process documents locally, but generate embeddings via the remote API
3. **Simple Vector Storage**: Store embeddings in a simple file-based format rather than a dedicated vector database
4. **Gradio Interface**: Create a simple but effective chat interface using Gradio
5. **Hugging Face Spaces Deployment**: Deploy the final solution to Hugging Face Spaces
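
The file-based vector storage in step 3 can be as simple as one JSON file holding chunks alongside their embeddings. A sketch under that assumption (file layout and function names are illustrative):

```python
import json
from pathlib import Path


def save_embeddings(path, chunks, embeddings):
    """Persist chunks and their embeddings as a single JSON file."""
    records = [{"text": t, "embedding": e} for t, e in zip(chunks, embeddings)]
    Path(path).write_text(json.dumps(records, ensure_ascii=False), encoding="utf-8")


def load_embeddings(path):
    """Load the store back into parallel lists of chunks and embeddings."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    return [r["text"] for r in records], [r["embedding"] for r in records]
```

`ensure_ascii=False` keeps Norwegian characters (æ, ø, å) readable in the stored file.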