---
title: LoRACaptioner
emoji: 🤠
colorFrom: red
colorTo: green
sdk: gradio
sdk_version: 5.25.2
app_file: demo.py
pinned: false
---

# LoRACaptioner

- **Image Captioning**: Automatically generate detailed and structured captions for your LoRA dataset.
- **Prompt Optimization**: Enhance prompts during inference to achieve high-quality outputs.

<table>
  <tr>
    <td><img src="examples/sukuna_4.png" alt="Sukuna example 4" width="400"></td>
    <td><img src="examples/sukuna_5.png" alt="Sukuna example 5" width="400"></td>
  </tr>
  <tr>
    <td><img src="examples/sukuna_6.png" alt="Sukuna example 6" width="400"></td>
    <td><img src="examples/sukuna_7.png" alt="Sukuna example 7" width="400"></td>
  </tr>
</table>

## Installation

### Prerequisites

- Python 3.10 or higher
- [Together AI](https://together.ai/) account and API key

### Setup

1. Create and activate a virtual environment, then install the dependencies:

   ```bash
   python -m venv venv
   source venv/bin/activate
   python -m pip install -r requirements.txt
   ```

2. Run inference on a directory of images:

   ```bash
   python main.py --input examples/ --output output/
   ```

<details>
<summary>Arguments</summary>

- `--input` (str): Directory containing images to caption.
- `--output` (str): Directory to save images and captions (defaults to input directory).
- `--batch_images` (flag): Caption images in batches by category.

</details>
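
The arguments listed above could be parsed along the following lines. This is a minimal sketch, not the actual contents of `main.py`: the `parse_args` helper and the choice to make `--input` required are assumptions made for illustration. Only the flag names and the "output defaults to input" behavior come from the list above.

```python
import argparse


def parse_args(argv=None):
    """Parse the documented CLI flags (hypothetical helper, not from main.py)."""
    parser = argparse.ArgumentParser(description="Caption images for a LoRA dataset.")
    # Assumption: --input is required; the README does not say so explicitly.
    parser.add_argument("--input", type=str, required=True,
                        help="Directory containing images to caption.")
    parser.add_argument("--output", type=str, default=None,
                        help="Directory to save images and captions.")
    parser.add_argument("--batch_images", action="store_true",
                        help="Caption images in batches by category.")
    args = parser.parse_args(argv)
    # Documented behavior: the output directory defaults to the input directory.
    if args.output is None:
        args.output = args.input
    return args


if __name__ == "__main__":
    print(parse_args())
```

Using `action="store_true"` means `--batch_images` takes no value and is off unless the flag is passed.
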

## Gradio Demo

Launch a user-friendly web interface for captioning and prompt optimization:

```bash
python demo.py
```

### Notes

- Images are processed individually in standard mode.
- For large collections, batch processing by category (`--batch_images`) is recommended.
- Each caption is saved as a `.txt` file with the same name as the image.
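
The caption naming convention in the last note can be sketched as follows. The helper names `caption_path_for` and `save_caption` are hypothetical and not taken from the project's code. The sketch only illustrates swapping the image extension for `.txt`, optionally writing into a separate output directory.

```python
from pathlib import Path


def caption_path_for(image_path, output_dir=None):
    """Return the .txt path that shares its base name with the image."""
    image_path = Path(image_path)
    # Assumption: with no output directory, the caption sits next to the image.
    target_dir = Path(output_dir) if output_dir is not None else image_path.parent
    return target_dir / (image_path.stem + ".txt")


def save_caption(image_path, caption, output_dir=None):
    """Write the caption text to its matching .txt file and return the path."""
    path = caption_path_for(image_path, output_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(caption, encoding="utf-8")
    return path
```

For example, `caption_path_for("examples/sukuna_1.png")` gives `examples/sukuna_1.txt`.
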

### Troubleshooting

- **API errors**: Ensure your Together AI API key is set and your account has funds.
- **Image formats**: Only `.png`, `.jpg`, `.jpeg`, and `.webp` files are supported.
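
Both checks above can be run before starting a captioning job. The sketch below is a hedged illustration and not part of the project. It assumes the key is exposed through the `TOGETHER_API_KEY` environment variable (the name the Together Python SDK reads by default; this project may configure it differently) and that extension matching is case-insensitive. The function names are hypothetical.

```python
import os
from pathlib import Path

# Extensions listed as supported in the troubleshooting notes.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def find_supported_images(directory):
    """Return the supported image files in a directory, sorted by path."""
    return sorted(
        p for p in Path(directory).iterdir()
        # Assumption: extensions are compared case-insensitively.
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def api_key_is_set(env_var="TOGETHER_API_KEY"):
    """Report whether the (assumed) API key variable is non-empty."""
    return bool(os.environ.get(env_var, "").strip())
```

This only confirms that a key is present. It cannot tell whether the account has funds, which surfaces as an API error at request time.
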

## Manual Captioning with ChatGPT

Follow the instructions in my [blog post](https://rishidesai.github.io/posts/character-lora/) and use `system_prompt.txt` as the system prompt.

## Examples

### Sukuna from Jujutsu Kaisen

**User Prompt:**
holding a bow and arrow in a dense forest

**Optimized Prompt:**
`tr1gg3r anime-style, pink spiky hair and black markings on face, shirtless with dark arm bands, holding bow and arrow, focused expression, dense forest, soft dappled lighting, three-quarter view`

<img src="examples/sukuna_1.png" alt="Sukuna with bow and arrow" width="500">

---

**User Prompt:**
drinking coffee in a san francisco cafe, white cloak, side view

**Optimized Prompt:**
`tr1gg3r anime-style, spiky pink hair and facial markings, white cloak, sitting with cup in hand, neutral expression, cafe interior with san francisco view, soft natural lighting, side profile`

<img src="examples/sukuna_2.png" alt="Sukuna drinking coffee" width="500">

---

**User Prompt:**
playing pick-up basketball on a sunny day

**Optimized Prompt:**
`tr1gg3r photorealistic, athletic build, sleeveless basketball jersey and shorts, jumping with ball, focused expression, outdoor basketball court with spectators, bright sunlight, low-angle view`

<img src="examples/sukuna_3.png" alt="Sukuna playing basketball" width="500">

---

### A character generated by Flux.1-dev

**User Prompt:**
riding a horse on a prairie during sunset

**Optimized Prompt:**
`tr1gg3r photorealistic, curly shoulder-length hair, floral button-up shirt, riding a horse, neutral expression, prairie during sunset, warm directional lighting, three-quarter view`

<img src="examples/woman_1.png" alt="Woman riding a horse" width="500">

---

**User Prompt:**
painting on a canvas in an art studio, side-view

**Optimized Prompt:**
`tr1gg3r photorealistic, curly shoulder-length hair, floral button-up shirt, standing at an angle with brush in hand, neutral expression, art studio with canvas and paints, soft natural lighting, right side profile`

<img src="examples/woman_2.png" alt="Woman painting in studio" width="500">

---

**User Prompt:**
standing on a skyscraper in a dense city, dramatic stormy lighting, rear view

**Optimized Prompt:**
`tr1gg3r photorealistic, curly shoulder-length hair, floral button-up shirt, standing upright, neutral expression, skyscraper rooftop in dense city, dramatic stormy lighting, back view`

<img src="examples/woman_3.png" alt="Woman on skyscraper" width="500">
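
The optimized prompts in these examples appear to share one field order: trigger word and style, then appearance, clothing, pose, expression, setting, lighting, and camera angle. The sketch below rebuilds the last example from those fields. The field names, the `PromptFields` class, and `build_prompt` are labels inferred from the examples for illustration. They are not the project's actual schema or code.

```python
from dataclasses import dataclass


@dataclass
class PromptFields:
    """Hypothetical field breakdown inferred from the example prompts."""
    style: str
    appearance: str
    clothing: str
    pose: str
    expression: str
    setting: str
    lighting: str
    camera_angle: str


def build_prompt(fields, trigger="tr1gg3r"):
    """Join the fields in the order seen in the examples, trigger word first."""
    parts = [
        f"{trigger} {fields.style}",
        fields.appearance,
        fields.clothing,
        fields.pose,
        fields.expression,
        fields.setting,
        fields.lighting,
        fields.camera_angle,
    ]
    return ", ".join(parts)


skyscraper = PromptFields(
    style="photorealistic",
    appearance="curly shoulder-length hair",
    clothing="floral button-up shirt",
    pose="standing upright",
    expression="neutral expression",
    setting="skyscraper rooftop in dense city",
    lighting="dramatic stormy lighting",
    camera_angle="back view",
)
print(build_prompt(skyscraper))
```
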