# SPO | Self-Supervised Prompt Optimization <img src="../../docs/resources/spo/SPO-logo.png" width="60" height="60" style="vertical-align: middle; margin-left: 10px; position: relative; top: -5px;">
An automated prompt engineering tool for Large Language Models (LLMs), designed for universal domain adaptation.
A next-generation prompt engineering system implementing **Self-Supervised Prompt Optimization (SPO)**. Achieves state-of-the-art performance with 17.8–90.9× higher cost efficiency than conventional methods. 🚀
<p align="center">
<a href=""><img src="../../docs/resources/spo/SPO-method.png" alt="Framework of SPO" title="Framework of SPO <sub>1</sub>" width="80%"></a>
</p>
## ✨ Core Advantages
- 💸 **Ultra-Low Cost** - _$0.15 per task optimization_
- 🏷️ **Zero Supervision** - _No ground truth/human feedback required_
- ⚡ **Universal Adaptation** - _Closed & open-ended tasks supported_
- 🔄 **Self-Evolving** - _Auto-optimization via LLM-as-judge mechanism_
[Read our paper on arXiv](https://arxiv.org/pdf/2502.06855)
## 📊 Experiment
### Closed Tasks
<p align="center">
<a href=""><img src="../../docs/resources/spo/SPO-closed_task_table.png" alt="SPO closed task table" title="SPO closed task table <sub>1</sub>" width="80%"></a>
<a href=""><img src="../../docs/resources/spo/SPO-closed_task_figure.png" alt="SPO closed task figure" title="SPO closed task figure <sub>1</sub>" width="80%"></a>
</p>
*SPO demonstrates superior cost efficiency, requiring only 1.1% to 5.6% of the cost of state-of-the-art methods while maintaining competitive performance.*
### Open-ended Tasks
<p align="center">
<a href=""><img src="../../docs/resources/spo/SPO-open_ended_task_figure.png" alt="Open-ended task figure" title="Open-ended task figure <sub>1</sub>" width="80%"></a>
</p>
*SPO significantly improves model performance across all model configurations in open-ended tasks.*
## 🚀 Quick Start
### 1. Configure Your API Key ⚙️
Configure LLM parameters in `config/config2.yaml` (see `examples/spo/config2.example.yaml` for reference).
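A minimal configuration might look like the following sketch (the model name and endpoint are illustrative; consult the example file for the full set of supported fields and providers):

```yaml
llm:
  api_type: "openai"              # illustrative; other providers are supported
  model: "gpt-4o-mini"            # illustrative model name
  base_url: "https://api.openai.com/v1"
  api_key: "YOUR_API_KEY"
```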
### 2. Define Your Iteration Template 📝
Create an iteration template file at `metagpt/ext/spo/settings/task_name.yaml`:
```yaml
prompt: |
  Please solve the following problem.

requirements: |
  ...

count: None

faq:
  - question: |
      ...
    answer: |
      ...

  - question: |
      ...
    answer: |
      ...
```
Notes:
- `prompt`: Initial prompt for iteration
- `requirements`: Desired effects/outcomes (e.g., generate more thinking, use more humorous language)
- `count`: Target word count for the generated prompt (e.g., 50). Set to None for no limit
- `faq`: QA pairs used for iteration; include an appropriate number of pairs (typically 3)
- `question`: Questions from the dataset used for iteration
- `answer`: Corresponding answers. Can contain desired thinking patterns or responses instead of actual answers, or can be left empty. See `metagpt/ext/spo/settings/Navigate.yaml` for reference
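For instance, a filled-in template for a hypothetical sentiment-classification task might look like this (all content below is illustrative, not a bundled example):

```yaml
prompt: |
  Determine whether the following review is Positive or Negative.

requirements: |
  Answer with a single word, preceded by one short sentence of reasoning.

count: None

faq:
  - question: |
      The battery died after two days and support never replied.
    answer: |
      Negative

  - question: |
      Lightweight, fast, and the screen is gorgeous.
    answer: |
      Positive
```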
### 3. Implement the PromptOptimizer 🔧
You have three ways to run the PromptOptimizer:
#### Option 1: Python Script
```python
from metagpt.ext.spo.components.optimizer import PromptOptimizer
from metagpt.ext.spo.utils.llm_client import SPO_LLM

if __name__ == "__main__":
    # Initialize LLM settings
    SPO_LLM.initialize(
        optimize_kwargs={"model": "claude-3-5-sonnet-20240620", "temperature": 0.7},
        evaluate_kwargs={"model": "gpt-4o-mini", "temperature": 0.3},
        execute_kwargs={"model": "gpt-4o-mini", "temperature": 0},
    )

    # Create and run the optimizer
    optimizer = PromptOptimizer(
        optimized_path="workspace",  # Output directory
        initial_round=1,             # Starting round
        max_rounds=10,               # Maximum optimization rounds
        template="Poem.yaml",        # Template file
        name="Poem",                 # Project name
    )
    optimizer.optimize()
```
#### Option 2: Command Line Interface
```bash
python -m examples.spo.optimize
```
Available command line options:
```
--opt-model       Model for optimization (default: claude-3-5-sonnet-20240620)
--opt-temp        Temperature for optimization (default: 0.7)
--eval-model      Model for evaluation (default: gpt-4o-mini)
--eval-temp       Temperature for evaluation (default: 0.3)
--exec-model      Model for execution (default: gpt-4o-mini)
--exec-temp       Temperature for execution (default: 0)
--workspace       Output directory path (default: workspace)
--initial-round   Initial round number (default: 1)
--max-rounds      Maximum number of rounds (default: 10)
--template        Template file name (default: Poem.yaml)
--name            Project name (default: Poem)
```

For help:
```bash
python -m examples.spo.optimize --help
```
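For example, to run five optimization rounds against the `Navigate.yaml` template mentioned above (the flag values here are illustrative):

```bash
python -m examples.spo.optimize \
  --opt-model claude-3-5-sonnet-20240620 \
  --exec-model gpt-4o-mini \
  --max-rounds 5 \
  --template Navigate.yaml \
  --name Navigate
```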
#### Option 3: Streamlit Web Interface
For a more user-friendly experience, you can use the Streamlit web interface to configure and run the optimizer.
First, install Streamlit:
```bash
pip install "streamlit~=1.42.0"
```
Then run the web interface:
```bash
python -m streamlit run metagpt/ext/spo/app.py
```
### 4. View Results
```
workspace
└── Project_name
    └── prompts
        ├── results.json
        ├── round_1
        │   ├── answers.txt
        │   └── prompt.txt
        ├── round_2
        │   ├── answers.txt
        │   └── prompt.txt
        ├── round_3
        │   ├── answers.txt
        │   └── prompt.txt
        ├── ...
        └── round_n
            ├── answers.txt
            └── prompt.txt
```
- `results.json`: Stores whether each iteration round was judged successful and other related information
- `prompt.txt`: The optimized prompt for the corresponding round
- `answers.txt`: The output results generated using the prompt for the corresponding round
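As a rough sketch, you could mine `results.json` programmatically, e.g. to find the latest successful round. Note that the `round` and `succeed` field names below are assumptions about the schema, so check them against your own output first:

```python
import json
from pathlib import Path


def best_round(results):
    """Return the highest-numbered round judged successful.

    Assumes each entry carries a ``round`` number and a boolean
    ``succeed`` flag -- adjust to the actual schema in your output.
    """
    winners = [r for r in results if r.get("succeed")]
    return max(winners, key=lambda r: r["round"]) if winners else None


# Made-up data mirroring the assumed schema:
sample = [
    {"round": 1, "succeed": True},
    {"round": 2, "succeed": False},
    {"round": 3, "succeed": True},
]
print(best_round(sample)["round"])  # -> 3

# With a real run, you would load the file instead, e.g.:
# results = json.loads(Path("workspace/Poem/prompts/results.json").read_text())
```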
## Citation
If you use SPO in your research, please cite our paper:
```
@misc{xiang2025spo,
title={Self-Supervised Prompt Optimization},
author={Jinyu Xiang and Jiayi Zhang and Zhaoyang Yu and Fengwei Teng and Jinhao Tu and Xinbing Liang and Sirui Hong and Chenglin Wu and Yuyu Luo},
year={2025},
eprint={2502.06855},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.06855},
}
```