---
title: Search Engine
emoji: 🔍
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
---


# Prompt Search Engine

## Table of Contents
1. [Project Overview](#project-overview)
2. [Environment Setup](#environment-setup)
3. [Run the Project](#run-the-project)
4. [API Endpoints and Usage](#api-endpoints-and-usage)
5. [Instructions for Building and Running the Docker Container](#instructions-for-building-and-running-the-docker-container)
6. [Deployment Details](#deployment-details)
7. [Running Tests](#running-tests)
8. [Information on How to Use the UI](#information-on-how-to-use-the-ui)
9. [Future Improvements](#future-improvements)

---

## Project Overview
The Prompt Search Engine is designed to address the growing need for high-quality prompts used in AI-generated content, 
particularly for models like Stable Diffusion. By leveraging a database of existing prompts, 
this search engine helps users discover the most relevant and effective prompts, significantly enhancing the quality of generated images.

The main goal of the prompt search engine is to return the top *n* prompts most similar to an input query. 
Better prompts, in turn, lead to higher-quality images from the Stable Diffusion models.
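The ranking step at the heart of the engine can be sketched as a vector-similarity search. The snippet below is a minimal illustration using cosine similarity over pre-computed embeddings; the actual embedding model and vector shapes used by the project are implementation details, and the toy 3-dimensional vectors here are stand-ins:

```python
import numpy as np

def top_n_similar(query_vec: np.ndarray, corpus_vecs: np.ndarray, n: int = 5):
    """Return (score, index) pairs for the n corpus vectors most similar to the query."""
    # Cosine similarity: dot product of L2-normalised vectors.
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    scores = c @ q
    # Indices of the n highest scores, best first.
    top = np.argsort(scores)[::-1][:n]
    return [(float(scores[i]), int(i)) for i in top]

# Toy example: 3-dimensional "embeddings" for a 4-prompt corpus.
corpus = np.array([[1.0, 0.0, 0.0],
                   [0.9, 0.1, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
print(top_n_similar(np.array([1.0, 0.05, 0.0]), corpus, n=2))
```

In the real engine, `corpus_vecs` would hold embeddings of the prompt dataset and `query_vec` the embedding of the user's query.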

### Technology Used

This project leverages a modern tech stack to deliver efficient search functionality:

1. **FastAPI**: A high-performance web framework for building the backend API.
2. **Gradio**: A lightweight UI framework for creating the frontend interface.
3. **Hugging Face Spaces**: For hosting the application using Docker.
4. **Hugging Face Datasets**: Downloads and processes the `google-research-datasets/conceptual_captions` dataset at runtime.
5. **Uvicorn**: ASGI server for running the FastAPI application.
6. **Python**: Core language used for development and scripting.

---

## Environment Setup

To set up the environment for the Prompt Search Engine, follow these steps:

### Prerequisites

1. **Python**: Ensure Python >= 3.9 is installed. You can download it from [Python.org](https://www.python.org/downloads/).
2. **Docker**: Install Docker to containerize and deploy the application. Visit [Docker's official site](https://www.docker.com/get-started) for installation instructions.
3. **Conda (Optional)**: Install Miniconda or Anaconda for managing a virtual environment locally.

### Steps to Install Dependencies

1. Navigate to the project directory:
   ```bash
   cd <project-directory>
   ```

2. Create and activate a Conda environment (optional):
   ```bash
   conda create -n prompt_search_env python={version} -y
   conda activate prompt_search_env
   ```
   - Replace `{version}` with your desired Python version (e.g., 3.9).

3. Install dependencies inside the Conda environment using `pip`:
   ```bash
   pip install -r requirements.txt
   ```

4. Review and update the `config.py` file to match your environment, such as specifying API keys or dataset paths.

## Run the Project

You can run the application locally using either a Conda environment or Docker:

- **Using Conda Environment:**
  1. Start the backend API. Swagger documentation will be accessible at `http://localhost:8000/docs`:
     ```bash
     python run.py
     ```
  2. Run the frontend application:
     ```bash
     python -m fe.gradio_app
     ```
  The frontend will be accessible at `http://localhost:7860`.

- **Using Docker:**
  Refer to the instructions in the next section for building and running the Docker container.

## Instructions for Building and Running the Docker Container

1. Build the Docker image:
   ```bash
   docker build -t prompt-search-engine .
   ```

2. Run the Docker container:
   ```bash
   docker run -p 8000:8000 -p 7860:7860 prompt-search-engine
   ```

   - The backend API will be accessible at `http://localhost:8000/docs`.
   - The frontend will be accessible at `http://localhost:7860`.

Your environment is now ready to use the Prompt Search Engine.

## API Endpoints and Usage

### `/search` (GET)
Endpoint for querying the search engine.

#### Parameters:
- `query` (str): The search query. **Required**.
- `n` (int): Number of results to return (default: 5). Must be greater than or equal to 1.

#### Example Request:
```bash
curl -X GET "http://localhost:8000/search?query=example+prompt&n=5"
```

#### Example Response:
```json
{
    "query": "example prompt",
    "results": [
        {"score": 0.95, "prompt": "example similar prompt 1"},
        {"score": 0.92, "prompt": "example similar prompt 2"}
    ]
}
```
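The same request can be issued from Python. A minimal client using only the standard library might look like this (assuming the service is running locally on port 8000, as described above):

```python
import json
import urllib.parse
import urllib.request

def search(query: str, n: int = 5, base_url: str = "http://localhost:8000"):
    """Call the /search endpoint and return the parsed JSON response."""
    params = urllib.parse.urlencode({"query": query, "n": n})
    with urllib.request.urlopen(f"{base_url}/search?{params}", timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example (requires the API to be running locally):
# data = search("example prompt", n=5)
# for result in data["results"]:
#     print(f'{result["score"]:.2f}  {result["prompt"]}')
```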

---

## Deployment Details

### Overview
This section outlines the steps to deploy the **Prompt Search Engine** application using Docker and Hugging Face Spaces. The application comprises a backend (API) and a frontend (Gradio-based UI) that runs together in a single Docker container.

### Prerequisites

1. A [Hugging Face account](https://huggingface.co/).
2. Git installed locally.
3. Access to the project repository on GitHub.
4. Docker installed locally for testing.
5. A Hugging Face **Access Token** (needed for authentication).

### Deployment Steps

1. **Create a Hugging Face Space:**
   - Log in to [Hugging Face Spaces](https://huggingface.co/spaces).
   - Click on **Create Space**.
   - Fill in the details:
     - **Space Name**: Choose a name like `promptsearchengine`.
     - **SDK**: Select `Docker`.
     - **Visibility**: Choose between public or private.
   - Click **Create Space** to generate a new repository.

2. **Create a Hugging Face Access Token:**
   - Log in to [Hugging Face](https://huggingface.co/).
   - Navigate to **Settings** > **Access Tokens**.
   - Click **New Token**:
     - **Name**: `Promptsearchengine Deployment`.
     - **Role**: Select `Write`.
   - Copy the token. You'll need it for pushing to Hugging Face Spaces.

3. **Test the Application Locally:**

   ```bash
   docker build -t promptsearchengine . 
   docker run -p 8000:8000 -p 7860:7860 promptsearchengine
   ```

   - **Backend**: Test at `http://localhost:8000`.
   - **Frontend**: Test at `http://localhost:7860`.

4. **Prepare the Project for Hugging Face Spaces:**

   - Ensure the `Dockerfile` is updated for Hugging Face Spaces:
     - Set environment variables for writable directories (e.g., `HF_HOME=/tmp/huggingface`).
   - Ensure a valid `README.md` is present at the root with the Hugging Face configuration:
     ```markdown
     ---
     title: Promptsearchengine
     emoji: 🔍
     colorFrom: blue
     colorTo: indigo
     sdk: docker
     pinned: false
     ---
     ```

5. **Push the Project to Hugging Face Spaces:**

    ```bash
    git remote add space https://huggingface.co/spaces/<your-username>/promptsearchengine
    git push space main
    ```

6. **Monitor the Build Logs:**

    - Navigate to your Space on Hugging Face.
    - Monitor the "Logs" tab to ensure the build completes successfully.

### Testing the Deployment

Once deployed, test the application on `https://huggingface.co/spaces/<your-username>/promptsearchengine`.


## Running Tests

Run the full test suite from the terminal in your local project environment:

```bash
python -m pytest -vv tests/
```

### Test Structure

- **Unit Tests**: Focus on isolated functionality, like individual endpoints or methods.
- **Integration Tests**: Verify end-to-end behavior using real components.


## Information on How to Use the UI

The **Prompt Search Engine** interface is designed for simplicity and ease of use. Follow these steps to interact with the application:

1. **Enter Your Query**: 
   - In the "Enter your query" field, type a phrase or keywords for which you want to find related prompts.
2. **Set the Number of Results**: 
   - Use the "Number of top results" field to specify how many similar prompts you want to retrieve. Default is 5.
3. **Submit a Query**:
   - Click the **Search** button to execute your query and display results in real-time.
4. **View Results**:
   - The results will display in a table with the following columns:
     - **Prompt**: The retrieved prompts that are most similar to your query.
     - **Similarity**: The similarity score between your query and each retrieved prompt.
5. **Interpreting Results**:
   - Higher similarity scores indicate a closer match to your query.
   - Use these prompts to refine or inspire new input for your task.

The clean, dark theme is optimized for readability, making it easier to analyze and use the results effectively.


## Future Improvements

1. **Replace Print Statements with Logging**
   - Integrate the `logging` module to replace print statements for better debugging, configurability, and structured logs.

2. **GitHub Workflow for Continuous Integration**
   - Set up GitHub Actions to automatically sync the GitHub repository with the Hugging Face Space. This will streamline deployment processes and ensure consistency across versions.

3. **Support for Predownloaded Datasets**
   - Add options to use predownloaded datasets instead of downloading them at runtime, enhancing usability for restricted or offline environments.

4. **Code Refactoring**
   - Extract remaining hardcoded values into a constants module or configuration file for better maintainability.

5. **Improve Prompt Corpus Handling**
   - Make the current cap on the number of prompts configurable via a parameter, or remove it entirely if unnecessary, to give users greater flexibility.

6. **Database or Persistent Storage**
   - Explore integrating a database for persistent storage, moving away from reliance on runtime memory or temporary files to enhance scalability and reliability.

7. **Enhance Unit Testing**
   - Expand the test coverage for edge cases and performance testing.
   - Automate test execution with GitHub Actions to maintain code quality and reliability.

8. **Frontend Enhancements**
   - Focus on general improvements to the Gradio frontend, such as better customization, theming, and user experience.
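Replacing print statements with the `logging` module (the first improvement above) could follow this minimal pattern; the logger name and message are illustrative:

```python
import logging

# Configure once at application startup (e.g. in run.py).
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)
logger = logging.getLogger("prompt_search_engine")

# Instead of: print("Loaded 100 prompts")
logger.info("Loaded %d prompts", 100)
```

Unlike `print`, this makes verbosity configurable per module and produces structured, timestamped log lines.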