Wendong-Fan committed
Commit 63e3fc9
1 Parent(s): 0c5d50e

update readme

Files changed (2):
  1. README.md +22 -12
  2. README_zh.md +22 -12
README.md CHANGED
@@ -219,11 +219,7 @@ Alternatively, you can set environment variables directly in your terminal:
 
 > **Note**: Environment variables set directly in the terminal will only persist for the current session.
 
- ### Additional Models
 
- For information on configuring other AI models beyond OpenAI, please refer to our [CAMEL models documentation](https://docs.camel-ai.org/key_modules/models.html#supported-model-platforms-in-camel).
-
- > **Note**: For optimal performance, we strongly recommend using OpenAI models. Our experiments show that other models may result in significantly lower performance on complex tasks and benchmarks.
 
 ## **Running with Docker**
 
@@ -265,11 +261,19 @@ python owl/run.py
 
 ## Running with Different Models
 
- ### Additional Models
+ ### Model Requirements
+
+ - **Tool Calling**: OWL requires models with robust tool calling capabilities to interact with various toolkits. Models must be able to understand tool descriptions, generate appropriate tool calls, and process tool outputs.
+
+ - **Multimodal Understanding**: For tasks involving web interaction, image analysis, or video processing, models with multimodal capabilities are required to interpret visual content and context.
+
+ #### Supported Models
 
- For information on configuring other AI models beyond OpenAI, please refer to our [CAMEL models documentation](https://docs.camel-ai.org/key_modules/models.html#supported-model-platforms-in-camel).
+ For information on configuring AI models, please refer to our [CAMEL models documentation](https://docs.camel-ai.org/key_modules/models.html#supported-model-platforms-in-camel).
 
- OWL supports various LLM backends. You can use the following scripts to run with different models:
+ > **Note**: For optimal performance, we strongly recommend using OpenAI models (GPT-4 or later versions). Our experiments show that other models may result in significantly lower performance on complex tasks and benchmarks, especially those requiring advanced multi-modal understanding and tool use.
+
+ OWL supports various LLM backends, though capabilities may vary depending on the model's tool calling and multimodal abilities. You can use the following scripts to run with different models:
 
 ```bash
 # Run with Qwen model
@@ -325,6 +329,8 @@ Example tasks you can try:
 
 # 🧰 Configuring Toolkits
 
+ > **Important**: Effective use of toolkits requires models with strong tool calling capabilities. For multimodal toolkits (Web, Image, Video), models must also have multimodal understanding abilities.
+
 OWL supports various toolkits that can be customized by modifying the `tools` list in your script:
 
 ```python
@@ -347,11 +353,15 @@ tools = [
 ## Available Toolkits
 
 Key toolkits include:
- - **WebToolkit**: Browser automation
- - **VideoAnalysisToolkit**: Video processing
- - **AudioAnalysisToolkit**: Audio processing
- - **CodeExecutionToolkit**: Python code execution
- - **ImageAnalysisToolkit**: Image analysis
+
+ ### Multimodal Toolkits (Require multimodal model capabilities)
+ - **WebToolkit**: Browser automation for web interaction and navigation
+ - **VideoAnalysisToolkit**: Video processing and content analysis
+ - **ImageAnalysisToolkit**: Image analysis and interpretation
+
+ ### Text-Based Toolkits
+ - **AudioAnalysisToolkit**: Audio processing (requires OpenAI API)
+ - **CodeExecutionToolkit**: Python code execution and evaluation
 - **SearchToolkit**: Web searches (Google, DuckDuckGo, Wikipedia)
 - **DocumentProcessingToolkit**: Document parsing (PDF, DOCX, etc.)
 
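The new "Model Requirements" and "Supported Models" text above points readers to the CAMEL models documentation for configuring backends. As a rough illustration of what switching backends looks like in practice, here is a minimal sketch using CAMEL's `ModelFactory`; the specific enum members (`ModelType.GPT_4O`, `ModelPlatformType.QWEN`, `ModelType.QWEN_MAX`) and the config values are assumptions based on current camel-ai conventions rather than anything defined in this commit, so check the linked documentation before relying on them.

```python
# Minimal sketch (not part of this commit): building model backends with
# CAMEL's ModelFactory, as described in the linked models documentation.
# The enum members and config values below are illustrative assumptions.
from camel.models import ModelFactory
from camel.types import ModelPlatformType, ModelType

# Recommended default: an OpenAI model with strong tool calling and
# multimodal support (expects OPENAI_API_KEY in the environment).
openai_model = ModelFactory.create(
    model_platform=ModelPlatformType.OPENAI,
    model_type=ModelType.GPT_4O,
    model_config_dict={"temperature": 0.0},
)

# Alternative backend, e.g. Qwen; tool calling and multimodal coverage
# vary by model, which is what the new "Model Requirements" section warns about.
qwen_model = ModelFactory.create(
    model_platform=ModelPlatformType.QWEN,
    model_type=ModelType.QWEN_MAX,
    model_config_dict={"temperature": 0.0},
)
```

Whichever backend is chosen, the note in the diff still applies: tool calling and multimodal support depend on the model, which is why the OpenAI recommendation remains in place.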
 
README_zh.md CHANGED
@@ -218,10 +218,6 @@ OWL requires various API keys to interact with different services. `owl/.env_templat
 
 > **Note**: Environment variables set directly in the terminal are only valid for the current session.
 
- ### Additional Models
-
- For information on configuring AI models other than OpenAI, please refer to our [CAMEL models documentation](https://docs.camel-ai.org/key_modules/models.html#supported-model-platforms-in-camel).
-
 ## **Running with Docker**
 
 If you want to run the OWL project with Docker, we provide full Docker support:
@@ -267,11 +263,19 @@ python owl/run_mini.py
 
 ## Running with Different Models
 
- ### Additional Models
+ ### Model Requirements
+
+ - **Tool Calling**: OWL requires models with strong tool calling capabilities to interact with the various toolkits. The model must be able to understand tool descriptions, generate appropriate tool calls, and process tool outputs.
+
+ - **Multimodal Understanding**: Tasks involving web interaction, image analysis, or video processing require models with multimodal capabilities to interpret visual content and context.
+
+ #### Supported Models
 
- For information on configuring AI models other than OpenAI, please refer to our [CAMEL models documentation](https://docs.camel-ai.org/key_modules/models.html#supported-model-platforms-in-camel).
+ For information on configuring models, please refer to our [CAMEL models documentation](https://docs.camel-ai.org/key_modules/models.html#supported-model-platforms-in-camel).
 
- OWL supports a variety of LLM backends. You can use the following scripts to run different models:
+ > **Note**: For optimal performance, we strongly recommend using OpenAI models (GPT-4 or later). Our experiments show that other models may perform noticeably worse on complex tasks and benchmarks, especially those requiring multimodal understanding and tool use.
+
+ OWL supports a variety of LLM backends, though capabilities may vary with each model's tool calling and multimodal abilities. You can use the following scripts to run different models:
 
 ```bash
 # Run with the Qwen model
@@ -321,6 +325,8 @@ OWL will automatically invoke document-related tools to process the file and extract the answer.
 
 # 🧰 Configuring Toolkits
 
+ > **Important**: Effective use of the toolkits requires models with strong tool calling capabilities. For multimodal toolkits (Web, image, video), the model must also have multimodal understanding abilities.
+
 OWL supports a variety of toolkits, which can be customized by modifying the `tools` list in your script:
 
 ```python
@@ -343,11 +349,15 @@ tools = [
 ## Key Toolkits
 
 Key toolkits include:
- - **WebToolkit**: Browser automation
- - **VideoAnalysisToolkit**: Video processing
- - **AudioAnalysisToolkit**: Audio processing
- - **CodeExecutionToolkit**: Python code execution
- - **ImageAnalysisToolkit**: Image analysis
+
+ ### Multimodal Toolkits (require multimodal model capabilities)
+ - **WebToolkit**: Browser automation for web interaction and navigation
+ - **VideoAnalysisToolkit**: Video processing and content analysis
+ - **ImageAnalysisToolkit**: Image analysis and interpretation
+
+ ### Text-Based Toolkits
+ - **AudioAnalysisToolkit**: Audio processing (requires the OpenAI API)
+ - **CodeExecutionToolkit**: Python code execution and evaluation
 - **SearchToolkit**: Web search (Google, DuckDuckGo, Wikipedia)
 - **DocumentProcessingToolkit**: Document parsing (PDF, DOCX, etc.)
 
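Both READMEs now group the toolkits by the model capability they need. As a sketch of what the customized `tools` list mentioned in the "Configuring Toolkits" section might look like under that split, the snippet below wires up a few CAMEL toolkits; the import paths, constructor defaults, and the `get_tools()` pattern are assumptions based on current camel-ai conventions, not something this commit defines.

```python
# Minimal sketch (not part of this commit): assembling a customized `tools`
# list that follows the multimodal / text-based split above. Import paths,
# constructor defaults and the get_tools() pattern are assumptions based on
# current camel-ai conventions.
from camel.toolkits import (
    CodeExecutionToolkit,   # text-based: Python code execution
    ImageAnalysisToolkit,   # multimodal: needs a vision-capable model
    SearchToolkit,          # text-based: Google / DuckDuckGo / Wikipedia search
)

tools = [
    *SearchToolkit().get_tools(),
    *CodeExecutionToolkit().get_tools(),
    # Only include multimodal toolkits when the chosen backend can actually
    # interpret images; drop this entry for a text-only model.
    *ImageAnalysisToolkit().get_tools(),
    # WebToolkit, VideoAnalysisToolkit, AudioAnalysisToolkit and
    # DocumentProcessingToolkit can be appended the same way.
]
```

A list like this would then be handed to the agents constructed by the run scripts; trimming the multimodal entries is the natural adjustment when running a backend without vision support.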