---
title: LoRACaptioner
emoji: 🤠
colorFrom: red
colorTo: green
sdk: gradio
sdk_version: 5.25.2
app_file: demo.py
pinned: false
---
# LoRACaptioner
- **Image Captioning**: Automatically generate detailed and structured captions for your LoRA dataset.
- **Prompt Optimization**: Enhance prompts during inference to achieve high-quality outputs.
<table>
<tr>
<td><img src="examples/sukuna_4.png" alt="Sukuna example 4" width="400"></td>
<td><img src="examples/sukuna_5.png" alt="Sukuna example 5" width="400"></td>
</tr>
<tr>
<td><img src="examples/sukuna_6.png" alt="Sukuna example 6" width="400"></td>
<td><img src="examples/sukuna_7.png" alt="Sukuna example 7" width="400"></td>
</tr>
</table>
## Installation
### Prerequisites
- Python 3.10 or higher
- [Together AI](https://together.ai/) account and API key
### Setup
1. Create and activate a virtual environment, then install the dependencies:
```bash
python -m venv venv
source venv/bin/activate
python -m pip install -r requirements.txt
```
2. Run captioning on a directory of images:
```bash
python main.py --input examples/ --output output/
```
<details>
<summary>Arguments</summary>
- `--input` (str): Directory containing images to caption.
- `--output` (str): Directory to save images and captions (defaults to input directory).
- `--batch_images` (flag): Caption images in batches by category.
</details>
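The arguments above can be sketched with `argparse`. This is an illustrative sketch only, not the actual contents of `main.py`: the function names `build_parser` and `parse_args` are hypothetical, and treating `--input` as required is an assumption. It shows one way the "defaults to input directory" behavior of `--output` could work.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Caption images for a LoRA dataset (illustrative sketch)."
    )
    # Assumption: --input is mandatory; the README does not state this.
    parser.add_argument("--input", type=str, required=True,
                        help="Directory containing images to caption.")
    parser.add_argument("--output", type=str, default=None,
                        help="Directory to save images and captions.")
    parser.add_argument("--batch_images", action="store_true",
                        help="Caption images in batches by category.")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse arguments and fall back to the input directory for --output."""
    args = build_parser().parse_args(argv)
    if args.output is None:
        args.output = args.input
    return args
```

For example, `parse_args(["--input", "examples/"])` leaves `batch_images` off and sets `output` to the same directory as `input`.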
## Gradio Demo
Launch a user-friendly web interface for captioning and prompt optimization:
```bash
python demo.py
```
### Notes
- In standard mode, each image is captioned individually.
- For large collections, batch processing by category (`--batch_images`) is recommended.
- Each caption is saved as a `.txt` file with the same base name as its image.
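The caption naming rule in the last note can be illustrated with a small helper. This is a sketch under assumptions: `caption_path_for` is a hypothetical name, not a function from this repository, and it assumes the caption is written into the output directory when one is given and next to the image otherwise.

```python
from pathlib import Path
from typing import Optional, Union


def caption_path_for(image_path: Union[str, Path],
                     output_dir: Optional[Union[str, Path]] = None) -> Path:
    """Return the .txt path that pairs with an image (hypothetical helper)."""
    image = Path(image_path)
    # Assumption: captions go to output_dir if given, else beside the image.
    target_dir = Path(output_dir) if output_dir is not None else image.parent
    # Same base name as the image, with the extension swapped for .txt.
    return target_dir / (image.stem + ".txt")
```

With this rule, `examples/sukuna_1.png` pairs with `examples/sukuna_1.txt`, or with `output/sukuna_1.txt` when `output_dir="output"` is passed.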
### Troubleshooting
- **API errors**: Ensure your Together AI API key is set and that the account has available credit.
- **Image formats**: Only `.png`, `.jpg`, `.jpeg`, and `.webp` files are supported.
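Both checks above can be run before captioning with a short preflight script. This is a sketch, not part of the repository: the helper names are hypothetical, the environment variable name `TOGETHER_API_KEY` is an assumption about how the key is supplied, and matching extensions case-insensitively is also an assumption about how the tool treats files such as `photo.JPG`.

```python
import os
from pathlib import Path

# Extensions listed as supported in the troubleshooting notes.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def split_by_support(directory):
    """Split files in a directory into (supported, unsupported) lists."""
    supported, unsupported = [], []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        # Assumption: extension matching ignores case.
        if path.suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(path)
        else:
            unsupported.append(path)
    return supported, unsupported


def api_key_is_set(env_var="TOGETHER_API_KEY"):
    """Return True if the (assumed) API key variable is non-empty."""
    return bool(os.environ.get(env_var, "").strip())
```

This only confirms that a key is present; it cannot tell whether the key is valid or whether the account has credit.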
## Manual Captioning with ChatGPT
Follow the instructions in my [blog post](https://rishidesai.github.io/posts/character-lora/) and use `system_prompt.txt` as the system prompt.
## Examples
### Sukuna from Jujutsu Kaisen
**User Prompt:**
holding a bow and arrow in a dense forest
**Optimized Prompt:**
`tr1gg3r anime-style, pink spiky hair and black markings on face, shirtless with dark arm bands, holding bow and arrow, focused expression, dense forest, soft dappled lighting, three-quarter view`
<img src="examples/sukuna_1.png" alt="Sukuna with bow and arrow" width="500">
---
**User Prompt:**
drinking coffee in a san francisco cafe, white cloak, side view
**Optimized Prompt:**
`tr1gg3r anime-style, spiky pink hair and facial markings, white cloak, sitting with cup in hand, neutral expression, cafe interior with san francisco view, soft natural lighting, side profile`
<img src="examples/sukuna_2.png" alt="Sukuna drinking coffee" width="500">
---
**User Prompt:**
playing pick-up basketball on a sunny day
**Optimized Prompt:**
`tr1gg3r photorealistic, athletic build, sleeveless basketball jersey and shorts, jumping with ball, focused expression, outdoor basketball court with spectators, bright sunlight, low-angle view`
<img src="examples/sukuna_3.png" alt="Sukuna playing basketball" width="500">
---
### A character generated by Flux.1-dev
**User Prompt:**
riding a horse on a prairie during sunset
**Optimized Prompt:**
`tr1gg3r photorealistic, curly shoulder-length hair, floral button-up shirt, riding a horse, neutral expression, prairie during sunset, warm directional lighting, three-quarter view`
<img src="examples/woman_1.png" alt="Woman riding a horse" width="500">
---
**User Prompt:**
painting on a canvas in an art studio, side-view
**Optimized Prompt:**
`tr1gg3r photorealistic, curly shoulder-length hair, floral button-up shirt, standing at an angle with brush in hand, neutral expression, art studio with canvas and paints, soft natural lighting, right side profile`
<img src="examples/woman_2.png" alt="Woman painting in studio" width="500">
---
**User Prompt:**
standing on a skyscraper in a dense city, dramatic stormy lighting, rear view
**Optimized Prompt:**
`tr1gg3r photorealistic, curly shoulder-length hair, floral button-up shirt, standing upright, neutral expression, skyscraper rooftop in dense city, dramatic stormy lighting, back view`
<img src="examples/woman_3.png" alt="Woman on skyscraper" width="500">