# Norwegian RAG Chatbot Project Structure

## Overview
This document outlines the project structure for our lightweight Norwegian RAG (retrieval-augmented generation) chatbot, which calls Hugging Face's Inference API instead of running models locally.

## Directory Structure
```
chatbot_project/
├── design/                  # Design documents
│   ├── rag_architecture.md
│   ├── document_processing.md
│   └── chat_interface.md
├── research/                # Research findings
│   └── norwegian_llm_research.md
├── src/                     # Source code
│   ├── api/                 # API integration
│   │   ├── __init__.py
│   │   ├── huggingface_api.py  # HF Inference API integration
│   │   └── config.py        # API configuration
│   ├── document_processing/ # Document processing
│   │   ├── __init__.py
│   │   ├── extractor.py     # Text extraction from documents
│   │   ├── chunker.py       # Text chunking
│   │   └── processor.py     # Main document processor
│   ├── rag/                 # RAG implementation
│   │   ├── __init__.py
│   │   ├── retriever.py     # Document retrieval
│   │   └── generator.py     # Response generation
│   ├── web/                 # Web interface
│   │   ├── __init__.py
│   │   ├── app.py           # Gradio app
│   │   └── embed.py         # Website embed snippet generation
│   ├── utils/               # Utilities
│   │   ├── __init__.py
│   │   └── helpers.py       # Helper functions
│   └── main.py              # Main application entry point
├── data/                    # Data storage
│   ├── documents/           # Original documents
│   └── processed/           # Processed documents and embeddings
├── tests/                   # Tests
│   ├── test_api.py
│   ├── test_document_processing.py
│   └── test_rag.py
├── venv/                    # Virtual environment
├── requirements-ultra-light.txt  # Lightweight dependencies
├── requirements.txt         # Original requirements (for reference)
└── README.md                # Project documentation
```

## Key Components

### 1. API Integration (`src/api/`)
- `huggingface_api.py`: Integration with Hugging Face Inference API for both LLM and embedding models
- `config.py`: Configuration for API endpoints, model IDs, and API keys
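The Inference API wrapper can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the endpoint shape and the `HF_API_TOKEN` environment variable name are assumptions, and the real `huggingface_api.py` may structure things differently.

```python
import json
import os
import urllib.request

# Assumed endpoint shape for the hosted Inference API.
API_BASE = "https://api-inference.huggingface.co/models"

def build_request(model_id: str, payload: dict) -> urllib.request.Request:
    """Assemble an authenticated POST request for a hosted model."""
    return urllib.request.Request(
        f"{API_BASE}/{model_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # HF_API_TOKEN is an assumed name for the API key variable.
            "Authorization": f"Bearer {os.environ.get('HF_API_TOKEN', '')}",
            "Content-Type": "application/json",
        },
    )

def query(model_id: str, payload: dict):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(build_request(model_id, payload), timeout=30) as resp:
        return json.loads(resp.read())
```

The same `query` helper can serve both text generation (`{"inputs": prompt}`) and embedding generation, with the Norwegian model IDs kept in `config.py`.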

### 2. Document Processing (`src/document_processing/`)
- `extractor.py`: Extract text from various document formats
- `chunker.py`: Split documents into manageable chunks
- `processor.py`: Orchestrate the document processing pipeline
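As one illustration of what `chunker.py` might contain, here is a minimal character-window chunker with overlap. The function name and default sizes are illustrative assumptions, not the project's actual implementation:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping fixed-size character windows."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap  # advance by less than a full chunk
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

The overlap keeps sentences that straddle a boundary retrievable from at least one chunk; a production chunker would likely split on sentence or paragraph boundaries instead of raw character offsets.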

### 3. RAG Implementation (`src/rag/`)
- `retriever.py`: Retrieve relevant document chunks based on the query
- `generator.py`: Generate responses using retrieved context
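A minimal retrieval sketch, assuming the processed embeddings are held in memory as `(text, vector)` pairs; the pure-Python cosine similarity and the `retrieve` signature are assumptions, and the real `retriever.py` may differ:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec: list[float],
             chunks: list[tuple[str, list[float]]],
             top_k: int = 3) -> list[str]:
    """Return the top_k chunk texts most similar to the query embedding."""
    ranked = sorted(chunks,
                    key=lambda c: cosine_similarity(query_vec, c[1]),
                    reverse=True)
    return [text for text, _ in ranked[:top_k]]
```

The retrieved texts are then concatenated into the prompt that `generator.py` sends to the LLM.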

### 4. Web Interface (`src/web/`)
- `app.py`: Gradio web interface for the chatbot
- `embed.py`: Generate an HTML snippet for embedding the chatbot in a website
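A sketch of what `embed.py` could produce, assuming the chatbot is reachable at a hosted Space URL (the function name and parameters are hypothetical; the Gradio app itself is not shown here):

```python
def embed_snippet(space_url: str, height: int = 600) -> str:
    """Return an HTML iframe snippet that embeds the hosted chatbot."""
    return (
        f'<iframe src="{space_url}" width="100%" height="{height}" '
        'frameborder="0"></iframe>'
    )
```

A site owner pastes the returned snippet into their page to embed the chat widget.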

### 5. Main Application (`src/main.py`)
- Entry point for the application
- Orchestrates the different components

## Implementation Approach

1. **Remote Model Execution**: Use Hugging Face's Inference API for both LLM and embedding models
2. **Lightweight Document Processing**: Process documents locally but use remote APIs for embedding generation
3. **Simple Vector Storage**: Store embeddings in a simple file-based format rather than a dedicated vector database
4. **Gradio Interface**: Create a simple but effective chat interface using Gradio
5. **Hugging Face Spaces Deployment**: Deploy the final solution to Hugging Face Spaces
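The file-based vector storage from step 3 can be as simple as one JSON file under `data/processed/` holding chunk texts alongside their embeddings. A sketch, where the record layout and helper names are assumptions:

```python
import json
from pathlib import Path

def save_index(records: list[dict], path: str) -> None:
    """Persist chunk texts and embeddings as a single JSON file.

    Each record is assumed to look like {"text": ..., "embedding": [...]}.
    """
    Path(path).write_text(json.dumps(records), encoding="utf-8")

def load_index(path: str) -> list[dict]:
    """Load the records back for retrieval at query time."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

At the expected corpus sizes this stays fast enough to load fully into memory, avoiding the extra dependency and operational weight of a vector database.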