---
title: Search Engine
emoji: ๐
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
---
# Prompt Search Engine
## Table of Contents
1. [Project Overview](#project-overview)
2. [Environment Setup](#environment-setup)
3. [Run the Project](#run-the-project)
4. [API Endpoints and Usage](#api-endpoints-and-usage)
5. [Instructions for Building and Running the Docker Container](#instructions-for-building-and-running-the-docker-container)
6. [Deployment Details](#deployment-details)
7. [Running Tests](#running-tests)
8. [Information on How to Use the UI](#information-on-how-to-use-the-ui)
9. [Future Improvements](#future-improvements)
---
## Project Overview
The Prompt Search Engine is designed to address the growing need for high-quality prompts used in AI-generated content,
particularly for models like Stable Diffusion. By leveraging a database of existing prompts,
this search engine helps users discover the most relevant and effective prompts, significantly enhancing the quality of generated images.
Given an input query, the engine returns the top n most similar prompts from the corpus, so users can supply better prompts to Stable Diffusion models and generate higher-quality images.
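To make the retrieval idea concrete, the minimal sketch below embeds a small prompt corpus and a query with a sentence-embedding model and ranks prompts by cosine similarity. The model name, example corpus, and function names are illustrative assumptions, not the project's actual implementation.
```python
# Minimal sketch of top-n prompt retrieval (illustrative; not the project's exact code).
# Assumes the sentence-transformers package; the model name is an assumption.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

corpus = [
    "a photorealistic portrait of an astronaut, studio lighting",
    "watercolor painting of a mountain village at sunrise",
    "cyberpunk city street at night, neon reflections, rain",
]
corpus_emb = model.encode(corpus, normalize_embeddings=True)

def top_n(query: str, n: int = 5) -> list[tuple[float, str]]:
    """Return (score, prompt) pairs sorted by cosine similarity to the query."""
    query_emb = model.encode([query], normalize_embeddings=True)[0]
    scores = corpus_emb @ query_emb  # cosine similarity (embeddings are normalized)
    order = np.argsort(scores)[::-1][:n]
    return [(float(scores[i]), corpus[i]) for i in order]

print(top_n("astronaut portrait", n=2))
```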
### Technology Used
This project leverages a modern tech stack to deliver efficient search functionality:
1. **FastAPI**: A high-performance web framework for building the backend API.
2. **Gradio**: A lightweight UI framework for creating the frontend interface.
3. **Hugging Face Spaces**: For hosting the application using Docker.
4. **Hugging Face Datasets**: Downloads and processes the `google-research-datasets/conceptual_captions` dataset at runtime (see the sketch after this list).
5. **Uvicorn**: ASGI server for running the FastAPI application.
6. **Python**: Core language used for development and scripting.
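As a hedged illustration of item 4 above, the prompt corpus could be pulled at runtime roughly as follows; the split, column selection, and row limit are assumptions, and the exact `load_dataset` arguments may vary with your `datasets` version.
```python
# Illustrative sketch of loading the prompt corpus at runtime (details are assumptions).
from datasets import load_dataset

dataset = load_dataset("google-research-datasets/conceptual_captions", split="train")
prompts = dataset.select(range(10_000))["caption"]  # assumed cap for faster startup
print(len(prompts), prompts[0])
```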
---
## Environment Setup
To set up the environment for the Prompt Search Engine, follow these steps:
### Prerequisites
1. **Python**: Ensure Python >= 3.9 is installed. You can download it from [Python.org](https://www.python.org/downloads/).
2. **Docker**: Install Docker to containerize and deploy the application. Visit [Docker's official site](https://www.docker.com/get-started) for installation instructions.
3. **Conda (Optional)**: Install Miniconda or Anaconda for managing a virtual environment locally.
### Steps to Install Dependencies
1. Navigate to the project directory:
```bash
cd <project-directory>
```
2. Create and activate a Conda environment (optional):
```bash
conda create -n prompt_search_env python={version} -y
conda activate prompt_search_env
```
- Replace `{version}` with your desired Python version (e.g., 3.9).
3. Install dependencies inside the Conda environment using `pip`:
```bash
pip install -r requirements.txt
```
4. Review and update the `config.py` file to match your environment (e.g., API keys or dataset paths); a hypothetical example is sketched below.
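The real `config.py` is project-specific; the sketch below is only a hypothetical example of the kinds of values it might hold (all names and defaults here are assumptions).
```python
# config.py -- hypothetical example; the real file's names and values may differ.
import os

# Embedding model settings (assumed name)
MODEL_NAME = os.getenv("MODEL_NAME", "all-MiniLM-L6-v2")

# Dataset settings (assumed names)
DATASET_NAME = os.getenv("DATASET_NAME", "google-research-datasets/conceptual_captions")
MAX_PROMPTS = int(os.getenv("MAX_PROMPTS", "10000"))

# Server settings (ports match the run instructions below)
API_HOST = os.getenv("API_HOST", "0.0.0.0")
API_PORT = int(os.getenv("API_PORT", "8000"))
FRONTEND_PORT = int(os.getenv("FRONTEND_PORT", "7860"))

# Optional API keys (assumed; only needed if your setup requires them)
HF_TOKEN = os.getenv("HF_TOKEN", "")
```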
## Run the Project
You can run the application locally using either a Conda environment or Docker:
- **Using Conda Environment:**
1. Start the backend API. Swagger documentation will be accessible at `http://0.0.0.0:8000/docs`:
```bash
python run.py
```
2. Run the frontend application:
```bash
python -m fe.gradio_app
```
The frontend will be accessible at `http://0.0.0.0:7860` (a minimal frontend sketch follows this list).
- **Using Docker:**
Refer to the instructions in the next section for building and running the Docker container.
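For orientation, a minimal Gradio frontend in the spirit of `fe/gradio_app` might look like the sketch below. The field labels and result columns mirror the UI description later in this README; the backend URL and response handling are assumptions rather than the project's actual code.
```python
# Minimal Gradio frontend sketch (illustrative; not the project's fe/gradio_app).
import gradio as gr
import requests

API_URL = "http://localhost:8000/search"  # assumes the backend started via `python run.py`

def search(query: str, n: float):
    """Call the backend /search endpoint and return rows for the results table."""
    response = requests.get(API_URL, params={"query": query, "n": int(n)}, timeout=30)
    response.raise_for_status()
    results = response.json().get("results", [])
    return [[item["prompt"], item["score"]] for item in results]

demo = gr.Interface(
    fn=search,
    inputs=[
        gr.Textbox(label="Enter your query"),
        gr.Number(value=5, label="Number of top results"),
    ],
    outputs=gr.Dataframe(headers=["Prompt", "Similarity"]),
    title="Prompt Search Engine",
)

if __name__ == "__main__":
    demo.launch(server_name="0.0.0.0", server_port=7860)
```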
## Instructions for Building and Running the Docker Container
1. Build the Docker image:
```bash
docker build -t prompt-search-engine .
```
2. Run the Docker container:
```bash
docker run -p 8000:8000 -p 7860:7860 prompt-search-engine
```
- The backend API will be accessible at `http://0.0.0.0:8000/docs`.
- The frontend will be accessible at `http://0.0.0.0:7860`.
Your environment is now ready to use the Prompt Search Engine.
## API Endpoints and Usage
### `/search` (GET)
Endpoint for querying the search engine.
#### Parameters:
- `query` (str): The search query. **Required**.
- `n` (int): Number of results to return (default: 5). Must be greater than or equal to 1.
#### Example Request:
```bash
curl -X GET "http://0.0.0.0:8000/search?query=example+prompt&n=5"
```
#### Example Response:
```json
{
"query": "example prompt",
"results": [
{"score": 0.95, "prompt": "example similar prompt 1"},
{"score": 0.92, "prompt": "example similar prompt 2"}
]
}
```
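The same endpoint can be called from Python; the snippet below is a small usage sketch that assumes the API is running locally on port 8000.
```python
# Query the /search endpoint from Python (assumes the API is running locally).
import requests

response = requests.get(
    "http://localhost:8000/search",
    params={"query": "example prompt", "n": 5},
    timeout=30,
)
response.raise_for_status()
for result in response.json()["results"]:
    print(f'{result["score"]:.2f}  {result["prompt"]}')
```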
---
## Deployment Details
### Overview
This section outlines the steps to deploy the **Prompt Search Engine** application using Docker and Hugging Face Spaces. The application comprises a backend API and a Gradio-based frontend UI that run together in a single Docker container.
### Prerequisites
1. A [Hugging Face account](https://huggingface.co/).
2. Git installed locally.
3. Access to the project repository on GitHub.
4. Docker installed locally for testing.
5. A Hugging Face **Access Token** (needed for authentication).
### Deployment Steps
1. **Create a Hugging Face Space:**
- Log in to [Hugging Face Spaces](https://huggingface.co/spaces).
- Click on **Create Space**.
- Fill in the details:
- **Space Name**: Choose a name like `promptsearchengine`.
- **SDK**: Select `Docker`.
- **Visibility**: Choose between public or private.
- Click **Create Space** to generate a new repository.
2. **Create a Hugging Face Access Token:**
- Log in to [Hugging Face](https://huggingface.co/).
- Navigate to **Settings** > **Access Tokens**.
- Click **New Token**:
- **Name**: `Promptsearchengine Deployment`.
- **Role**: Select `Write`.
- Copy the token. You'll need it for pushing to Hugging Face Spaces.
3. **Test the Application Locally:**
```bash
docker build -t promptsearchengine .
docker run -p 8000:8000 -p 7860:7860 promptsearchengine
```
- **Backend**: Test at `http://localhost:8000`.
- **Frontend**: Test at `http://localhost:7860`.
4. **Prepare the Project for Hugging Face Spaces:**
- Ensure the `Dockerfile` is updated for Hugging Face Spaces:
- Set environment variables for writable directories (e.g., `HF_HOME=/tmp/huggingface`).
- Ensure a valid `README.md` is present at the root with the Hugging Face configuration:
```markdown
---
title: Promptsearchengine
emoji: ๐
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
---
```
5. **Push the Project to Hugging Face Spaces:**
```bash
git remote add space https://huggingface.co/spaces/<your-username>/promptsearchengine
git push space main
```
6. **Monitor the Build Logs:**
- Navigate to your Space on Hugging Face.
- Monitor the "Logs" tab to ensure the build completes successfully.
### Testing the Deployment
Once deployed, test the application at `https://huggingface.co/spaces/<your-username>/promptsearchengine`.
## Running Tests
Execute all the tests by running the following command in a terminal within your local project environment:
```bash
python -m pytest -vv tests/
```
### Test Structure
- **Unit Tests**: Focus on isolated functionality, like individual endpoints or methods.
- **Integration Tests**: Verify end-to-end behavior using real components.
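As a hedged example, a unit test for the `/search` endpoint might look like the sketch below; the import path `app.main` and the exposed `app` object are assumptions and should be adjusted to the project's actual module layout.
```python
# Hypothetical unit tests for the /search endpoint (import path is an assumption).
from fastapi.testclient import TestClient

from app.main import app  # adjust to wherever the FastAPI instance actually lives

client = TestClient(app)

def test_search_returns_requested_number_of_results():
    response = client.get("/search", params={"query": "sunset over the ocean", "n": 3})
    assert response.status_code == 200
    body = response.json()
    assert body["query"] == "sunset over the ocean"
    assert len(body["results"]) <= 3  # corpus may contain fewer than n prompts

def test_search_rejects_invalid_n():
    response = client.get("/search", params={"query": "sunset", "n": 0})
    assert response.status_code == 422  # n must be >= 1
```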
## Information on How to Use the UI
The **Prompt Search Engine** interface is designed for simplicity and ease of use. Follow these steps to interact with the application:
1. **Enter Your Query**:
- In the "Enter your query" field, type a phrase or keywords for which you want to find related prompts.
2. **Set the Number of Results**:
- Use the "Number of top results" field to specify how many similar prompts you want to retrieve. Default is 5.
3. **Submit a Query**:
- Click the **Search** button to execute your query and display results in real-time.
4. **View Results**:
- The results will display in a table with the following columns:
- **Prompt**: The retrieved prompts that are most similar to your query.
- **Similarity**: The similarity score between your query and each retrieved prompt.
5. **Interpreting Results**:
- Higher similarity scores indicate a closer match to your query.
- Use these prompts to refine or inspire new input for your task.
The clean, dark theme is optimized for readability, making it easier to analyze and use the results effectively.
## Future Improvements
1. **Replace Print Statements with Logging**
- Integrate the `logging` module to replace print statements for better debugging, configurability, and structured logs.
2. **GitHub Workflow for Continuous Integration**
- Set up GitHub Actions to automatically sync the GitHub repository with the Hugging Face Space. This will streamline deployment processes and ensure consistency across versions.
3. **Support for Predownloaded Datasets**
- Add options to use predownloaded datasets instead of downloading them at runtime, enhancing usability for restricted or offline environments.
4. **Code Refactoring**
- Extract remaining hardcoded values into a constants module or configuration file for better maintainability.
5. **Improve Prompt Corpus Handling**
- Make the current limitation on the number of prompts configurable via a parameter or remove it entirely if unnecessary to provide users with greater flexibility.
6. **Database or Persistent Storage**
- Explore integrating a database for persistent storage, moving away from reliance on runtime memory or temporary files to enhance scalability and reliability.
7. **Enhance Unit Testing**
- Expand the test coverage for edge cases and performance testing;
- Automate test execution with GitHub Actions to maintain code quality and reliability.
8. **Frontend Enhancements**
- Focus on general improvements to the Gradio frontend, such as better customization, theming, and user experience.