|
# SPO | Self-Supervised Prompt Optimization <img src="../../docs/resources/spo/SPO-logo.png" width="60" height="60" style="vertical-align: middle; margin-left: 10px; position: relative; top: -5px;"> |
|
|
|
|
|
An automated prompt engineering tool for Large Language Models (LLMs), designed for universal domain adaptation. |
|
|
|
A next-generation prompt engineering system implementing **Self-Supervised Prompt Optimization (SPO)**. Achieves state-of-the-art performance with 17.8-90.9Γ higher cost efficiency than conventional methods. π |
|
|
|
<p align="center"> |
|
<a href=""><img src="../../docs/resources/spo/SPO-method.png" alt="Framework of SPO" title="Framework of SPO <sub>1</sub>" width="80%"></a> |
|
</p> |
|
|
|
## β¨ Core Advantages |
|
|
|
- πΈ **Ultra-Low Cost** - _$0.15 per task optimization_ |
|
- π·οΈ **Zero Supervision** - _No ground truth/human feedback required_ |
|
- β‘ **Universal Adaptation** - _Closed & open-ended tasks supported_ |
|
- π **Self-Evolving** - _Auto-optimization via LLM-as-judge mechanism_ |
|
|
|
[Read our paper on arXiv](https://arxiv.org/pdf/2502.06855) |
|
|
|
## π Experiment |
|
|
|
### Closed Tasks |
|
<p align="center"> |
|
<a href=""><img src="../../docs/resources/spo/SPO-closed_task_table.png" alt="SPO closed task table" title="SPO closed task table <sub>1</sub>" width="80%"></a> |
|
<a href=""><img src="../../docs/resources/spo/SPO-closed_task_figure.png" alt="SPO closed task figure" title="SPO closed task figure <sub>1</sub>" width="80%"></a> |
|
</p> |
|
|
|
*SPO demonstrates superior cost efficiency, requiring only 1.1% to 5.6% of the cost of state-of-the-art methods while maintaining competitive performance.* |
|
|
|
### Open-ended Tasks |
|
<p align="center"> |
|
<a href=""><img src="../../docs/resources/spo/SPO-open_ended_task_figure.png" alt="Open-ended task figure" title="Open-ended task figure <sub>1</sub>" width="80%"></a> |
|
</p> |
|
|
|
*SPO significantly improves model performance across all model configurations in open-ended tasks.* |
|
|
|
## π Quick Start |
|
|
|
### 1. Configure Your API Key βοΈ |
|
|
|
Configure LLM parameters in `config/config2.yaml` (see `examples/spo/config2.example.yaml` for reference) |
|
### 2. Define Your Iteration template π |
|
|
|
Create a Iteration template file `metagpt/ext/spo/settings/task_name.yaml`: |
|
```yaml |
|
prompt: | |
|
Please solve the following problem. |
|
|
|
requirements: | |
|
... |
|
|
|
count: None |
|
|
|
faq: |
|
- question: | |
|
... |
|
answer: | |
|
... |
|
|
|
- question: | |
|
... |
|
answer: | |
|
... |
|
``` |
|
|
|
Notes: |
|
- `prompt`: Initial prompt for iteration |
|
- `requirements`: Desired effects/outcomes (e.g., generate more thinking, use more humorous language) |
|
- `count`: Target word count for the generated prompt (e.g., 50). Set to None for no limit |
|
- `faq`: QA pairs used for iteration, can include appropriate number of pairs (typically 3) |
|
- `question`: Questions from the dataset used for iteration |
|
- `answer`: Corresponding answers. Can contain desired thinking patterns or responses instead of actual answers, or can be left empty. See `metagpt/ext/spo/settings/Navigate.yaml` for reference |
|
|
|
### 3. Implement the PromptOptimizer π§ |
|
|
|
You have three ways to run the PromptOptimizer: |
|
|
|
#### Option 1: Python Script |
|
|
|
```python |
|
from metagpt.ext.spo.components.optimizer import PromptOptimizer |
|
from metagpt.ext.spo.utils.llm_client import SPO_LLM |
|
|
|
if __name__ == "__main__": |
|
# Initialize LLM settings |
|
SPO_LLM.initialize( |
|
optimize_kwargs={"model": "claude-3-5-sonnet-20240620", "temperature": 0.7}, |
|
evaluate_kwargs={"model": "gpt-4o-mini", "temperature": 0.3}, |
|
execute_kwargs={"model": "gpt-4o-mini", "temperature": 0} |
|
) |
|
|
|
# Create and run optimizer |
|
optimizer = PromptOptimizer( |
|
optimized_path="workspace", # Output directory |
|
initial_round=1, # Starting round |
|
max_rounds=10, # Maximum optimization rounds |
|
template="Poem.yaml", # Template file |
|
name="Poem", # Project name |
|
) |
|
|
|
optimizer.optimize() |
|
``` |
|
|
|
#### Option 2: Command Line Interface |
|
|
|
```bash |
|
python -m examples.spo.optimize |
|
``` |
|
|
|
Available command line options: |
|
``` |
|
--opt-model Model for optimization (default: claude-3-5-sonnet-20240620) |
|
--opt-temp Temperature for optimization (default: 0.7) |
|
--eval-model Model for evaluation (default: gpt-4o-mini) |
|
--eval-temp Temperature for evaluation (default: 0.3) |
|
--exec-model Model for execution (default: gpt-4o-mini) |
|
--exec-temp Temperature for execution (default: 0) |
|
--workspace Output directory path (default: workspace) |
|
--initial-round Initial round number (default: 1) |
|
--max-rounds Maximum number of rounds (default: 10) |
|
--template Template file name (default: Poem.yaml) |
|
--name Project name (default: Poem) |
|
``` |
|
|
|
For help: |
|
```bash |
|
python -m examples.spo.optimize --help |
|
``` |
|
|
|
#### Option 3: Streamlit Web Interface |
|
|
|
For a more user-friendly experience, you can use the Streamlit web interface to configure and run the optimizer. |
|
|
|
First, install Streamlit: |
|
```bash |
|
pip install "streamlit~=1.42.0" |
|
``` |
|
|
|
Then run the web interface: |
|
```bash |
|
python -m streamlit run metagpt/ext/spo/app.py |
|
``` |
|
|
|
### 4. View Results |
|
``` |
|
workspace |
|
βββ Project_name |
|
βββ prompts |
|
βββ results.json |
|
βββ round_1 |
|
β βββ answers.txt |
|
β βββ prompt.txt |
|
βββ round_2 |
|
β βββ answers.txt |
|
β βββ prompt.txt |
|
βββ round_3 |
|
β βββ answers.txt |
|
β βββ prompt.txt |
|
βββ ... |
|
βββ round_n |
|
βββ answers.txt |
|
βββ prompt.txt |
|
``` |
|
|
|
- `results.json`: Stores whether each iteration round was judged successful and other related information |
|
- `prompt.txt`: The optimized prompt for the corresponding round |
|
- `answers.txt`: The output results generated using the prompt for the corresponding round |
|
|
|
## Citation |
|
|
|
If you use SPO in your research, please cite our paper: |
|
|
|
``` |
|
@misc{xiang2025spo, |
|
title={Self-Supervised Prompt Optimization}, |
|
author={Jinyu Xiang and Jiayi Zhang and Zhaoyang Yu and Fengwei Teng and Jinhao Tu and Xinbing Liang and Sirui Hong and Chenglin Wu and Yuyu Luo}, |
|
year={2025}, |
|
eprint={2502.06855}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.CL}, |
|
url={https://arxiv.org/abs/2502.06855}, |
|
} |
|
``` |