# Norwegian RAG Chatbot Project Structure
## Overview
This document outlines the project structure for our lightweight Norwegian RAG chatbot implementation that uses Hugging Face's Inference API instead of running models locally.
## Directory Structure
```
chatbot_project/
├── design/                          # Design documents
│   ├── rag_architecture.md
│   ├── document_processing.md
│   └── chat_interface.md
├── research/                        # Research findings
│   └── norwegian_llm_research.md
├── src/                             # Source code
│   ├── api/                         # API integration
│   │   ├── __init__.py
│   │   ├── huggingface_api.py       # HF Inference API integration
│   │   └── config.py                # API configuration
│   ├── document_processing/         # Document processing
│   │   ├── __init__.py
│   │   ├── extractor.py             # Text extraction from documents
│   │   ├── chunker.py               # Text chunking
│   │   └── processor.py             # Main document processor
│   ├── rag/                         # RAG implementation
│   │   ├── __init__.py
│   │   ├── retriever.py             # Document retrieval
│   │   └── generator.py             # Response generation
│   ├── web/                         # Web interface
│   │   ├── __init__.py
│   │   ├── app.py                   # Gradio app
│   │   └── embed.py                 # Embedding functionality
│   ├── utils/                       # Utilities
│   │   ├── __init__.py
│   │   └── helpers.py               # Helper functions
│   └── main.py                      # Main application entry point
├── data/                            # Data storage
│   ├── documents/                   # Original documents
│   └── processed/                   # Processed documents and embeddings
├── tests/                           # Tests
│   ├── test_api.py
│   ├── test_document_processing.py
│   └── test_rag.py
├── venv/                            # Virtual environment
├── requirements-ultra-light.txt     # Lightweight dependencies
├── requirements.txt                 # Original requirements (for reference)
└── README.md                        # Project documentation
```
## Key Components
### 1. API Integration (`src/api/`)
- `huggingface_api.py`: Integration with Hugging Face Inference API for both LLM and embedding models
- `config.py`: Configuration for API endpoints, model IDs, and API keys
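As a rough illustration of what the API layer might look like, the sketch below builds and sends a request to the Hugging Face Inference API using only the standard library. The endpoint layout is the public Inference API convention; the `HF_API_KEY` environment variable name and function names are assumptions, not necessarily what the project uses.

```python
import json
import os
import urllib.request

API_BASE = "https://api-inference.huggingface.co"


def build_request(model_id, payload, api_key=None):
    """Build the URL, headers, and JSON body for an Inference API call."""
    key = api_key or os.environ.get("HF_API_KEY", "")  # assumed env var name
    url = f"{API_BASE}/models/{model_id}"
    headers = {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
    return url, headers, json.dumps(payload).encode("utf-8")


def query(model_id, payload, api_key=None):
    """POST the payload to the model endpoint and return the parsed JSON."""
    url, headers, body = build_request(model_id, payload, api_key)
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())
```

The same `query` helper can serve both text generation and feature extraction (embeddings), since the Inference API routes on the model ID.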
### 2. Document Processing (`src/document_processing/`)
- `extractor.py`: Extract text from various document formats
- `chunker.py`: Split documents into manageable chunks
- `processor.py`: Orchestrate the document processing pipeline
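A minimal sketch of the chunking step: split extracted text into overlapping character windows so that context is not lost at chunk boundaries. The default sizes here are illustrative assumptions, not the project's tuned values.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping chunks of roughly chunk_size characters."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Step forward by less than a full chunk so consecutive chunks overlap
        start += chunk_size - overlap
    return chunks
```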
### 3. RAG Implementation (`src/rag/`)
- `retriever.py`: Retrieve the document chunks most relevant to a query
- `generator.py`: Generate responses using retrieved context
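The retrieval step can be sketched as ranking stored chunk embeddings by cosine similarity to the query embedding. This pure-Python version is for illustration only; the actual module might use numpy for speed.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, chunk_embeddings, chunks, top_k=3):
    """Return the top_k (chunk, score) pairs most similar to the query."""
    scored = sorted(
        zip(chunks, (cosine_similarity(query_embedding, e) for e in chunk_embeddings)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return scored[:top_k]
```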
### 4. Web Interface (`src/web/`)
- `app.py`: Gradio web interface for the chatbot
- `embed.py`: Generate embedding code for website integration
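A sketch of how `app.py` could wire a reply function into Gradio's `ChatInterface`. The `answer_fn` indirection is an assumption made for illustration; it stands in for the RAG pipeline.

```python
def respond(message, history, answer_fn):
    """Produce a chatbot reply; answer_fn encapsulates the RAG pipeline."""
    return answer_fn(message)


if __name__ == "__main__":
    import gradio as gr

    # str.upper is a placeholder for the real retrieve-and-generate function
    demo = gr.ChatInterface(
        fn=lambda message, history: respond(message, history, answer_fn=str.upper),
        title="Norsk RAG Chatbot",
    )
    demo.launch()
```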
### 5. Main Application (`src/main.py`)
- Entry point for the application
- Orchestrates the different components
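Conceptually, the orchestration in `main.py` reduces to: retrieve context for the question, build a prompt around it, and hand the prompt to the generator. The function names and the Norwegian prompt template below are assumptions for illustration.

```python
def answer(question, retriever, generator):
    """Run one RAG turn: retrieve context, build a prompt, generate a reply."""
    context_chunks = retriever(question)
    prompt = (
        "Bruk konteksten nedenfor til å svare på spørsmålet.\n\n"
        + "\n".join(context_chunks)
        + "\n\nSpørsmål: "
        + question
    )
    return generator(prompt)
```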
## Implementation Approach
1. **Remote Model Execution**: Use Hugging Face's Inference API for both LLM and embedding models
2. **Lightweight Document Processing**: Process documents locally but use remote APIs for embedding generation
3. **Simple Vector Storage**: Store embeddings in a simple file-based format rather than a dedicated vector database
4. **Gradio Interface**: Create a simple but effective chat interface using Gradio
5. **Hugging Face Spaces Deployment**: Deploy the final solution to Hugging Face Spaces
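The "simple vector storage" point above could be as small as a JSON file holding chunks alongside their embeddings under `data/processed/`. The file layout here is an assumption, not the project's actual schema.

```python
import json
from pathlib import Path


def save_index(chunks, embeddings, path):
    """Persist chunks and their embeddings as a single JSON file."""
    Path(path).write_text(
        json.dumps({"chunks": chunks, "embeddings": embeddings}),
        encoding="utf-8",
    )


def load_index(path):
    """Load chunks and embeddings back from the JSON file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return data["chunks"], data["embeddings"]
```

This keeps the deployment footprint small enough for Hugging Face Spaces, at the cost of linear-scan retrieval, which is acceptable for modest document collections.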