---
title: Search Engine
emoji: 🔍
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
---
# Prompt Search Engine

## Table of Contents

1. [Project Overview](#project-overview)
2. [Environment Setup](#environment-setup)
3. [Run the Project](#run-the-project)
4. [Instructions for Building and Running the Docker Container](#instructions-for-building-and-running-the-docker-container)
5. [API Endpoints and Usage](#api-endpoints-and-usage)
6. [Deployment Details](#deployment-details)
7. [Running Tests](#running-tests)
8. [Information on How to Use the UI](#information-on-how-to-use-the-ui)
9. [Future Improvements](#future-improvements)

---
## Project Overview | |
The Prompt Search Engine is designed to address the growing need for high-quality prompts used in AI-generated content, | |
particularly for models like Stable Diffusion. By leveraging a database of existing prompts, | |
this search engine helps users discover the most relevant and effective prompts, significantly enhancing the quality of generated images. | |
The main goal of the prompt search engine is to return the top n most similar prompts with respect to the input prompt query. | |
This way, we can generate higher quality images by providing better prompts for the Stable Diffusion models. | |
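Conceptually, the engine vectorizes a corpus of prompts, vectorizes the incoming query, and ranks the corpus by cosine similarity. The snippet below is a minimal sketch of that idea, not the project's actual implementation; the `sentence-transformers` model name and the `SimplePromptSearch` class are illustrative assumptions.

```python
# Minimal sketch of top-n prompt retrieval (illustrative; the real project code may differ).
import numpy as np
from sentence_transformers import SentenceTransformer


class SimplePromptSearch:
    """Illustrative top-n prompt retrieval via cosine similarity (hypothetical class name)."""

    def __init__(self, corpus: list[str], model_name: str = "all-MiniLM-L6-v2"):
        self.model = SentenceTransformer(model_name)  # embedding model is an assumption
        self.corpus = corpus
        # Pre-compute L2-normalized embeddings so cosine similarity reduces to a dot product.
        self.corpus_vectors = self.model.encode(corpus, normalize_embeddings=True)

    def search(self, query: str, n: int = 5) -> list[tuple[float, str]]:
        query_vector = self.model.encode([query], normalize_embeddings=True)[0]
        scores = self.corpus_vectors @ query_vector   # cosine similarities against the corpus
        top_indices = np.argsort(scores)[::-1][:n]    # indices of the n best matches
        return [(float(scores[i]), self.corpus[i]) for i in top_indices]


if __name__ == "__main__":
    engine = SimplePromptSearch(
        ["a photo of a cat", "a dog running on the beach", "sunset over snowy mountains"]
    )
    print(engine.search("cute kitten", n=2))
```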
### Technology Used

This project leverages a modern tech stack to deliver efficient search functionality:

1. **FastAPI**: A high-performance web framework for building the backend API.
2. **Gradio**: A lightweight UI framework for creating the frontend interface.
3. **Hugging Face Spaces**: For hosting the application using Docker.
4. **Hugging Face Datasets**: Downloads and processes the `google-research-datasets/conceptual_captions` dataset at runtime.
5. **Uvicorn**: ASGI server for running the FastAPI application.
6. **Python**: Core language used for development and scripting.

---
## Environment Setup

To set up the environment for the Prompt Search Engine, follow these steps:

### Prerequisites

1. **Python**: Ensure Python >= 3.9 is installed. You can download it from [Python.org](https://www.python.org/downloads/).
2. **Docker**: Install Docker to containerize and deploy the application. Visit [Docker's official site](https://www.docker.com/get-started) for installation instructions.
3. **Conda (optional)**: Install Miniconda or Anaconda for managing a virtual environment locally.

### Steps to Install Dependencies

1. Navigate to the project directory:
   ```bash
   cd <project-directory>
   ```
2. Create and activate a Conda environment (optional):
   ```bash
   conda create -n prompt_search_env python={version} -y
   conda activate prompt_search_env
   ```
   Replace `{version}` with your desired Python version (e.g., `3.9`).
3. Install the dependencies with `pip` (inside the Conda environment, if you created one):
   ```bash
   pip install -r requirements.txt
   ```
4. Review and update `config.py` to match your environment, for example API keys or dataset paths. An illustrative sketch of such a file is shown below.
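The exact contents of `config.py` depend on the repository; the sketch below only illustrates the kind of values you might expect to adjust, and every name in it is an assumption rather than the project's actual settings.

```python
# Illustrative config.py layout (hypothetical names; check the real file in this repo).
DATASET_NAME = "google-research-datasets/conceptual_captions"  # dataset pulled at runtime
MAX_PROMPTS = 10_000       # optional cap on the size of the prompt corpus
API_HOST = "0.0.0.0"
API_PORT = 8000            # backend (FastAPI served by Uvicorn)
FRONTEND_PORT = 7860       # frontend (Gradio)
```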
## Run the Project

You can run the application locally using either a Conda environment or Docker.

- **Using a Conda environment:**
  1. Start the backend API. Swagger documentation will be accessible at `http://0.0.0.0:8000/docs`:
     ```bash
     python run.py
     ```
  2. Run the frontend application:
     ```bash
     python -m fe.gradio_app
     ```
     The frontend will be accessible at `http://0.0.0.0:7860`.
- **Using Docker:**
  Refer to the instructions in the next section for building and running the Docker container.
## Instructions for Building and Running the Docker Container

1. Build the Docker image:
   ```bash
   docker build -t prompt-search-engine .
   ```
2. Run the Docker container:
   ```bash
   docker run -p 8000:8000 -p 7860:7860 prompt-search-engine
   ```
   - The backend API will be accessible at `http://0.0.0.0:8000/docs`.
   - The frontend will be accessible at `http://0.0.0.0:7860`.

Your environment is now ready to use the Prompt Search Engine.
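To confirm that both services inside the container are responding, a quick check of the documented ports can be run from Python (a sketch that assumes the default ports above):

```python
# Quick smoke test that both services in the container respond (assumes the default ports).
import requests

for name, url in [("backend", "http://localhost:8000/docs"),
                  ("frontend", "http://localhost:7860")]:
    status_code = requests.get(url, timeout=10).status_code
    print(f"{name}: HTTP {status_code}")
```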
## API Endpoints and Usage

### `/search` (GET)

Endpoint for querying the search engine.

#### Parameters

- `query` (str): The search query. **Required**.
- `n` (int): Number of results to return (default: 5). Must be greater than or equal to 1.

#### Example Request

```bash
curl -X GET "http://0.0.0.0:8000/search?query=example+prompt&n=5"
```

#### Example Response

```json
{
  "query": "example prompt",
  "results": [
    {"score": 0.95, "prompt": "example similar prompt 1"},
    {"score": 0.92, "prompt": "example similar prompt 2"}
  ]
}
```
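The same request can be issued from Python with the `requests` library. This assumes the backend is running locally on port 8000, as in the curl example above.

```python
# Query the /search endpoint from Python (assumes the API is running on localhost:8000).
import requests

response = requests.get(
    "http://localhost:8000/search",
    params={"query": "example prompt", "n": 5},
    timeout=30,
)
response.raise_for_status()
payload = response.json()
for result in payload["results"]:
    print(f'{result["score"]:.2f}  {result["prompt"]}')
```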
---

## Deployment Details

### Overview

This section outlines the steps to deploy the **Prompt Search Engine** application using Docker and Hugging Face Spaces. The application comprises a backend (API) and a frontend (Gradio-based UI) that run together in a single Docker container.

### Prerequisites

1. A [Hugging Face account](https://huggingface.co/).
2. Git installed locally.
3. Access to the project repository on GitHub.
4. Docker installed locally for testing.
5. A Hugging Face **Access Token** (needed for authentication).

### Deployment Steps

1. **Create a Hugging Face Space:**
   - Log in to [Hugging Face Spaces](https://huggingface.co/spaces).
   - Click **Create Space**.
   - Fill in the details:
     - **Space Name**: Choose a name like `promptsearchengine`.
     - **SDK**: Select `Docker`.
     - **Visibility**: Choose public or private.
   - Click **Create Space** to generate a new repository.
2. **Create a Hugging Face Access Token:**
   - Log in to [Hugging Face](https://huggingface.co/).
   - Navigate to **Settings** > **Access Tokens**.
   - Click **New Token**:
     - **Name**: `Promptsearchengine Deployment`.
     - **Role**: Select `Write`.
   - Copy the token. You'll need it when pushing to Hugging Face Spaces.
3. **Test the Application Locally:**
   ```bash
   docker build -t promptsearchengine .
   docker run -p 8000:8000 -p 7860:7860 promptsearchengine
   ```
   - **Backend**: Test at `http://localhost:8000`.
   - **Frontend**: Test at `http://localhost:7860`.
4. **Prepare the Project for Hugging Face Spaces:**
   - Ensure the `Dockerfile` is updated for Hugging Face Spaces, e.g., set environment variables for writable directories (such as `HF_HOME=/tmp/huggingface`).
   - Ensure a valid `README.md` is present at the repository root with the Hugging Face configuration header:
     ```markdown
     ---
     title: Promptsearchengine
     emoji: 🔍
     colorFrom: blue
     colorTo: indigo
     sdk: docker
     pinned: false
     ---
     ```
5. **Push the Project to Hugging Face Spaces:**
   ```bash
   git remote add space https://huggingface.co/spaces/<your-username>/promptsearchengine
   git push space main
   ```
6. **Monitor the Build Logs:**
   - Navigate to your Space on Hugging Face.
   - Check the "Logs" tab to ensure the build completes successfully.

### Testing the Deployment

Once deployed, test the application at `https://huggingface.co/spaces/<your-username>/promptsearchengine`.
## Running Tests

Execute all tests by running the following in a terminal within your local project environment:

```bash
python -m pytest -vv tests/
```

### Test Structure

- **Unit Tests**: Focus on isolated functionality, such as individual endpoints or methods.
- **Integration Tests**: Verify end-to-end behavior using real components.
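As an illustration of the unit-test style, a hypothetical test for the `/search` endpoint could use FastAPI's `TestClient`. The import path `api.app` is an assumption; adjust it to wherever the FastAPI instance is defined in this repository.

```python
# Hypothetical unit test for the /search endpoint (adjust the import to the real app module).
from fastapi.testclient import TestClient

from api.app import app  # assumption: the FastAPI instance is exposed as `app`

client = TestClient(app)


def test_search_returns_requested_number_of_results():
    response = client.get("/search", params={"query": "a cat", "n": 3})
    assert response.status_code == 200
    body = response.json()
    assert body["query"] == "a cat"
    assert len(body["results"]) <= 3


def test_search_rejects_invalid_n():
    # n must be >= 1 per the API contract; the exact error code depends on how it is enforced.
    response = client.get("/search", params={"query": "a cat", "n": 0})
    assert response.status_code in (400, 422)
```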
## Information on How to Use the UI

The **Prompt Search Engine** interface is designed for simplicity and ease of use. Follow these steps to interact with the application:

1. **Enter Your Query**:
   - In the "Enter your query" field, type a phrase or keywords for which you want to find related prompts.
2. **Set the Number of Results**:
   - Use the "Number of top results" field to specify how many similar prompts to retrieve. The default is 5.
3. **Submit the Query**:
   - Click the **Search** button to execute your query and display results in real time.
4. **View Results**:
   - The results are displayed in a table with the following columns:
     - **Prompt**: The retrieved prompts most similar to your query.
     - **Similarity**: The similarity score between your query and each retrieved prompt.
5. **Interpret the Results**:
   - Higher similarity scores indicate a closer match to your query.
   - Use these prompts to refine or inspire new input for your task.

The clean, dark theme is optimized for readability, making it easier to analyze and use the results effectively.
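For orientation, a stripped-down version of such a frontend might look like the sketch below. It is illustrative only (not the actual `fe/gradio_app.py`) and assumes the backend's `/search` endpoint is reachable on port 8000.

```python
# Minimal Gradio sketch of the UI described above (illustrative, not the real fe/gradio_app.py).
import gradio as gr
import requests


def search(query: str, n: float):
    response = requests.get(
        "http://localhost:8000/search",
        params={"query": query, "n": int(n)},
        timeout=30,
    )
    response.raise_for_status()
    results = response.json()["results"]
    # Return rows matching the Prompt / Similarity columns shown in the UI.
    return [[r["prompt"], round(r["score"], 3)] for r in results]


demo = gr.Interface(
    fn=search,
    inputs=[
        gr.Textbox(label="Enter your query"),
        gr.Number(label="Number of top results", value=5, precision=0),
    ],
    outputs=gr.Dataframe(headers=["Prompt", "Similarity"]),
    title="Prompt Search Engine",
)

if __name__ == "__main__":
    demo.launch(server_name="0.0.0.0", server_port=7860)
```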
## Future Improvements

1. **Replace Print Statements with Logging**
   - Integrate the `logging` module to replace print statements for better debugging, configurability, and structured logs (see the sketch after this list).
2. **GitHub Workflow for Continuous Integration**
   - Set up GitHub Actions to automatically sync the GitHub repository with the Hugging Face Space. This will streamline deployment and ensure consistency across versions.
3. **Support for Predownloaded Datasets**
   - Add options to use predownloaded datasets instead of downloading them at runtime, improving usability in restricted or offline environments.
4. **Code Refactoring**
   - Extract remaining hardcoded values into a constants module or configuration file for better maintainability.
5. **Improve Prompt Corpus Handling**
   - Make the current limit on the number of prompts configurable via a parameter, or remove it entirely if unnecessary, to give users greater flexibility.
6. **Database or Persistent Storage**
   - Explore integrating a database for persistent storage, moving away from runtime memory and temporary files to improve scalability and reliability.
7. **Enhance Unit Testing**
   - Expand test coverage for edge cases and performance testing.
   - Automate test execution with GitHub Actions to maintain code quality and reliability.
8. **Frontend Enhancements**
   - Focus on general improvements to the Gradio frontend, such as better customization, theming, and user experience.
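As a starting point for item 1, a basic `logging` configuration could look like the following sketch; logger names and messages are placeholders, not code from this repository.

```python
# Basic logging setup that could replace print statements (illustrative placeholders).
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s - %(message)s",
)
logger = logging.getLogger("prompt_search_engine")

logger.info("Loaded %d prompts into the corpus", 10_000)
logger.warning("Falling back to the default number of results: n=%d", 5)
```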