# Norwegian RAG Chatbot Project Structure

## Overview
This document outlines the project structure for our lightweight Norwegian RAG chatbot implementation that uses Hugging Face's Inference API instead of running models locally.
## Directory Structure
```
chatbot_project/
├── design/                        # Design documents
│   ├── rag_architecture.md
│   ├── document_processing.md
│   └── chat_interface.md
├── research/                      # Research findings
│   └── norwegian_llm_research.md
├── src/                           # Source code
│   ├── api/                       # API integration
│   │   ├── __init__.py
│   │   ├── huggingface_api.py     # HF Inference API integration
│   │   └── config.py              # API configuration
│   ├── document_processing/       # Document processing
│   │   ├── __init__.py
│   │   ├── extractor.py           # Text extraction from documents
│   │   ├── chunker.py             # Text chunking
│   │   └── processor.py           # Main document processor
│   ├── rag/                       # RAG implementation
│   │   ├── __init__.py
│   │   ├── retriever.py           # Document retrieval
│   │   └── generator.py           # Response generation
│   ├── web/                       # Web interface
│   │   ├── __init__.py
│   │   ├── app.py                 # Gradio app
│   │   └── embed.py               # Embedding functionality
│   ├── utils/                     # Utilities
│   │   ├── __init__.py
│   │   └── helpers.py             # Helper functions
│   └── main.py                    # Main application entry point
├── data/                          # Data storage
│   ├── documents/                 # Original documents
│   └── processed/                 # Processed documents and embeddings
├── tests/                         # Tests
│   ├── test_api.py
│   ├── test_document_processing.py
│   └── test_rag.py
├── venv/                          # Virtual environment
├── requirements-ultra-light.txt   # Lightweight dependencies
├── requirements.txt               # Original requirements (for reference)
└── README.md                      # Project documentation
```
## Key Components
### 1. API Integration (`src/api/`)

- `huggingface_api.py`: Integration with the Hugging Face Inference API for both the LLM and the embedding model
- `config.py`: Configuration for API endpoints, model IDs, and API keys
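A minimal sketch of what `huggingface_api.py` could look like. The model IDs, function names, and the `HF_API_TOKEN` environment variable are illustrative assumptions, not taken from the project code:

```python
# Sketch of the Inference API wrapper; model IDs and env var name are assumptions.
import os
import requests

API_URL = "https://api-inference.huggingface.co/models/"
HEADERS = {"Authorization": f"Bearer {os.environ['HF_API_TOKEN']}"}

EMBEDDING_MODEL = "NbAiLab/nb-sbert-base"          # assumed Norwegian embedding model
LLM_MODEL = "norallm/normistral-7b-warm-instruct"  # assumed Norwegian LLM

def get_embeddings(texts: list[str]) -> list[list[float]]:
    """Embed a batch of texts via the feature-extraction endpoint."""
    response = requests.post(API_URL + EMBEDDING_MODEL, headers=HEADERS, json={"inputs": texts})
    response.raise_for_status()
    return response.json()

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion via the text-generation endpoint."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    response = requests.post(API_URL + LLM_MODEL, headers=HEADERS, json=payload)
    response.raise_for_status()
    return response.json()[0]["generated_text"]
```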
### 2. Document Processing (`src/document_processing/`)

- `extractor.py`: Extract text from various document formats
- `chunker.py`: Split documents into manageable chunks
- `processor.py`: Orchestrate the document processing pipeline
### 3. RAG Implementation (`src/rag/`)

- `retriever.py`: Retrieve relevant document chunks based on the query
- `generator.py`: Generate responses using the retrieved context
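Retrieval over file-stored embeddings can be done with plain cosine similarity. This sketch assumes NumPy arrays and an illustrative function signature:

```python
# Cosine-similarity retrieval sketch; signature and top_k default are assumptions.
import numpy as np

def retrieve(query_embedding: list[float], chunk_embeddings: np.ndarray,
             chunks: list[str], top_k: int = 3) -> list[str]:
    """Return the top_k chunks most similar to the query embedding."""
    query = np.asarray(query_embedding)
    # Cosine similarity; the small epsilon guards against zero-norm vectors.
    sims = chunk_embeddings @ query / (
        np.linalg.norm(chunk_embeddings, axis=1) * np.linalg.norm(query) + 1e-10
    )
    top_indices = np.argsort(sims)[::-1][:top_k]
    return [chunks[i] for i in top_indices]
```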
### 4. Web Interface (`src/web/`)

- `app.py`: Gradio web interface for the chatbot
- `embed.py`: Generate embed code for integrating the chatbot into a website
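A minimal Gradio chat interface in the spirit of `app.py` might look like this; `answer_question` is a placeholder for the actual retrieval and generation pipeline:

```python
# Minimal Gradio chat interface sketch; the handler is a placeholder.
import gradio as gr

def answer_question(message: str, history: list) -> str:
    # The real implementation would embed the question, retrieve context,
    # and call the LLM through the Inference API.
    return "Placeholder answer from the RAG pipeline."

demo = gr.ChatInterface(fn=answer_question, title="Norwegian RAG Chatbot")

if __name__ == "__main__":
    demo.launch()
```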
### 5. Main Application (`src/main.py`)

- Entry point for the application
- Orchestrates the different components
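A hypothetical view of how `main.py` could tie the pieces together; the imported function names mirror the module layout above but are assumptions about the actual code:

```python
# Hypothetical orchestration flow; function names are assumptions.
from src.document_processing.processor import process_documents
from src.web.app import launch_app

def main() -> None:
    # 1. Turn raw documents into chunks and embeddings under data/processed/.
    process_documents("data/documents/", "data/processed/")
    # 2. Start the Gradio interface; it calls the retriever and generator per query.
    launch_app()

if __name__ == "__main__":
    main()
```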
## Implementation Approach
- **Remote Model Execution**: Use Hugging Face's Inference API for both the LLM and the embedding model
- **Lightweight Document Processing**: Process documents locally, but generate embeddings through the remote API
- **Simple Vector Storage**: Store embeddings in a simple file-based format rather than a dedicated vector database (see the sketch below)
- **Gradio Interface**: Create a simple but effective chat interface using Gradio
- **Hugging Face Spaces Deployment**: Deploy the final solution to Hugging Face Spaces
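The simple vector storage could be as small as a NumPy array for the embeddings plus a JSON file for the chunk texts under `data/processed/`; the file names and function names here are illustrative assumptions:

```python
# File-based vector storage sketch; paths and file names are illustrative.
import json
import numpy as np

def save_index(chunks: list[str], embeddings: list[list[float]], path: str = "data/processed") -> None:
    """Persist chunk texts and their embeddings as plain files."""
    np.save(f"{path}/embeddings.npy", np.asarray(embeddings))
    with open(f"{path}/chunks.json", "w", encoding="utf-8") as f:
        json.dump(chunks, f, ensure_ascii=False)

def load_index(path: str = "data/processed"):
    """Load the stored embeddings and chunk texts back into memory."""
    embeddings = np.load(f"{path}/embeddings.npy")
    with open(f"{path}/chunks.json", encoding="utf-8") as f:
        chunks = json.load(f)
    return chunks, embeddings
```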