lazychih114 committed on
Commit 5ec6899 · 1 Parent(s): d2703ad

update gaia

Files changed (2)
  1. README.md +7 -2
  2. README_zh.md +8 -2
README.md CHANGED
@@ -172,9 +172,14 @@ Example tasks you can try:
  # 🧪 Experiments

- We provided a script to reproduce the results on GAIA.
- You can check the `run_gaia_roleplaying.py` file and run the following command:
+ To reproduce OWL's GAIA benchmark score of 58.18:

+ 1. Switch to the `gaia58.18` branch:
+ ```bash
+ git checkout gaia58.18
+ ```
+
+ 2. Run the evaluation script:
  ```bash
  python run_gaia_roleplaying.py
  ```
README_zh.md CHANGED
@@ -164,9 +164,15 @@ logger.success(f"Answer: {answer}")
  - "Summarize the main points of this research paper: [paper URL]"
  # 🧪 Experiments

- We provide a script for reproducing the experimental results on GAIA.
- You can check the `run_gaia_roleplaying.py` file and run the following command:
+ We provide a script for reproducing the experimental results on GAIA.
+ To reproduce our score of 58.18 on the GAIA benchmark:

+ 1. Switch to the `gaia58.18` branch:
+ ```bash
+ git checkout gaia58.18
+ ```
+
+ 2. Run the evaluation script:
  ```bash
  python run_gaia_roleplaying.py
  ```