Wendong-Fan committed on
Commit 92ee0c5 · 2 Parent(s): 600bb99 98ef7f1

feat: add mcp sample (#213)
README.md CHANGED
@@ -122,7 +122,9 @@ https://private-user-images.githubusercontent.com/55657767/420212194-e813fc05-13
  - **Browser Automation**: Utilize the Playwright framework for simulating browser interactions, including scrolling, clicking, input handling, downloading, navigation, and more.
  - **Document Parsing**: Extract content from Word, Excel, PDF, and PowerPoint files, converting them into text or Markdown format.
  - **Code Execution**: Write and execute Python code using interpreter.
- - **Built-in Toolkits**: Access to a comprehensive set of built-in toolkits including ArxivToolkit, AudioAnalysisToolkit, CodeExecutionToolkit, DalleToolkit, DataCommonsToolkit, ExcelToolkit, GitHubToolkit, GoogleMapsToolkit, GoogleScholarToolkit, ImageAnalysisToolkit, MathToolkit, NetworkXToolkit, NotionToolkit, OpenAPIToolkit, RedditToolkit, SearchToolkit, SemanticScholarToolkit, SymPyToolkit, VideoAnalysisToolkit, WeatherToolkit, BrowserToolkit, and many more for specialized tasks.
+ - **Built-in Toolkits**: Access to a comprehensive set of built-in toolkits including:
+   - **Model Context Protocol (MCP)**: A universal protocol layer that standardizes AI model interactions with various tools and data sources
+   - **Core Toolkits**: ArxivToolkit, AudioAnalysisToolkit, CodeExecutionToolkit, DalleToolkit, DataCommonsToolkit, ExcelToolkit, GitHubToolkit, GoogleMapsToolkit, GoogleScholarToolkit, ImageAnalysisToolkit, MathToolkit, NetworkXToolkit, NotionToolkit, OpenAPIToolkit, RedditToolkit, SearchToolkit, SemanticScholarToolkit, SymPyToolkit, VideoAnalysisToolkit, WeatherToolkit, BrowserToolkit, and many more for specialized tasks
 
  # 🛠️ Installation
 
@@ -275,6 +277,23 @@ For more detailed Docker usage instructions, including cross-platform support, o
 
  # 🚀 Quick Start
 
+ ## Try MCP (Model Context Protocol) Integration
+
+ Experience the power of MCP by running our example that demonstrates multi-agent information retrieval and processing:
+
+ ```bash
+ # Set up MCP servers (one-time setup)
+ npx -y @smithery/cli install @wonderwhy-er/desktop-commander --client claude
+ npx @wonderwhy-er/desktop-commander setup
+
+ # Run the MCP example
+ python owl/run_mcp.py
+ ```
+
+ This example showcases how OWL agents can seamlessly interact with file systems, web automation, and information retrieval through the MCP protocol. Check out `owl/run_mcp.py` for the full implementation.
+
+ ## Basic Usage
+
  After installation and setting up your environment variables, you can start using OWL right away:
 
  ```bash
@@ -358,6 +377,14 @@ Here are some tasks you can try with OWL:
 
  # 🧰 Toolkits and Capabilities
 
+ ## Model Context Protocol (MCP)
+
+ OWL's MCP integration provides a standardized way for AI models to interact with various tools and data sources.
+
+ Try our comprehensive MCP example in `owl/run_mcp.py` to see these capabilities in action!
+
+ ## Available Toolkits
+
  > **Important**: Effective use of toolkits requires models with strong tool calling capabilities. For multimodal toolkits (Web, Image, Video), models must also have multimodal understanding abilities.
 
  OWL supports various toolkits that can be customized by modifying the `tools` list in your script:
README_zh.md CHANGED
@@ -105,7 +105,7 @@
  </div>
 
  - **[2025.03.12]**: 在SearchToolkit中添加了Bocha搜索功能,集成了火山引擎模型平台,并更新了Azure和OpenAI Compatible模型的结构化输出和工具调用能力。
- - **[2025.03.11]**: 我们添加了 MCPToolkit、FileWriteToolkit 和 TerminalToolkit,增强 OWL Agent的工具调用、文件写入能力和终端命令执行功能。
+ - **[2025.03.11]**: 我们添加了 MCPToolkit、FileWriteToolkit 和 TerminalToolkit,增强了 OWL Agent 的 MCP(模型上下文协议)集成、文件写入能力和终端命令执行功能。MCP 作为一个通用协议层,标准化了 AI 模型与各种数据源和工具的交互方式。
  - **[2025.03.09]**: 我们添加了基于网页的用户界面,使系统交互变得更加简便。
  - **[2025.03.07]**: 我们开源了 🦉 OWL 项目的代码库。
  - **[2025.03.03]**: OWL 在 GAIA 基准测试中取得 58.18 平均分,在开源框架中排名第一!
@@ -272,6 +272,23 @@ chmod +x build_docker.sh
  更多详细的Docker使用说明,包括跨平台支持、优化配置和故障排除,请参阅 [DOCKER_README.md](.container/DOCKER_README.md)
 
  # 🚀 快速开始
+
+ ## 尝试 MCP(模型上下文协议)集成
+
+ 体验 MCP 的强大功能,运行我们的示例来展示多智能体信息检索和处理:
+
+ ```bash
+ # 设置 MCP 服务器(仅需一次性设置)
+ npx -y @smithery/cli install @wonderwhy-er/desktop-commander --client claude
+ npx @wonderwhy-er/desktop-commander setup
+
+ # 运行 MCP 示例
+ python owl/run_mcp.py
+ ```
+
+ 这个示例展示了 OWL 智能体如何通过 MCP 协议无缝地与文件系统、网页自动化和信息检索进行交互。查看 `owl/run_mcp.py` 了解完整实现。
+
+ ## 基本用法
 
  运行以下示例:
 
@@ -352,6 +369,14 @@ OWL 将自动调用与文档相关的工具来处理文件并提取答案。
 
  # 🧰 工具包与功能
 
+ ## 模型上下文协议(MCP)
+
+ OWL 的 MCP 集成为 AI 模型与各种工具和数据源的交互提供了标准化的方式。
+
+ 查看我们的综合示例 `owl/run_mcp.py` 来体验这些功能!
+
+ ## 可用工具包
+
  > **重要提示**:有效使用工具包需要具备强大工具调用能力的模型。对于多模态工具包(Web、图像、视频),模型还必须具备多模态理解能力。
 
  OWL支持多种工具包,可通过修改脚本中的`tools`列表进行自定义:
owl/mcp_servers_config.json ADDED
@@ -0,0 +1,16 @@
+ {
+   "mcpServers": {
+     "desktop-commander": {
+       "command": "npx",
+       "args": [
+         "-y",
+         "@wonderwhy-er/desktop-commander"
+       ]
+     },
+     "playwright": {
+       "command": "npx",
+       "args": ["-y", "@executeautomation/playwright-mcp-server"]
+     }
+   }
+ }
+
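The config file above maps each server name under `"mcpServers"` to the command used to launch it over stdio. A minimal stdlib-only sketch (not part of the PR) of how a client could parse this shape; `load_server_commands` is a hypothetical helper name, not an OWL or MCP API:

```python
import json

# Same shape as owl/mcp_servers_config.json above.
CONFIG = """
{
  "mcpServers": {
    "desktop-commander": {
      "command": "npx",
      "args": ["-y", "@wonderwhy-er/desktop-commander"]
    },
    "playwright": {
      "command": "npx",
      "args": ["-y", "@executeautomation/playwright-mcp-server"]
    }
  }
}
"""


def load_server_commands(raw: str) -> dict:
    """Return {server_name: full launch argv} from a config document."""
    servers = json.loads(raw).get("mcpServers", {})
    return {
        name: [spec["command"], *spec.get("args", [])]
        for name, spec in servers.items()
    }


commands = load_server_commands(CONFIG)
print(commands["playwright"])
# ['npx', '-y', '@executeautomation/playwright-mcp-server']
```

Each argv would then be spawned as a subprocess that speaks the MCP stdio transport.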
owl/run_deepseek_zh.py CHANGED
@@ -31,7 +31,7 @@ from camel.toolkits import (
  from camel.types import ModelPlatformType, ModelType
 
 
- from utils import OwlRolePlaying, run_society, DocumentProcessingToolkit
+ from utils import OwlRolePlaying, run_society
 
  from camel.logger import set_log_level
 
@@ -99,9 +99,7 @@ def construct_society(question: str) -> OwlRolePlaying:
  def main():
      r"""Main function to run the OWL system with an example question."""
      # Example research question
-     question = (
-         "搜索OWL项目最近的新闻并生成一篇报告,最后保存到本地。"
-     )
+     question = "搜索OWL项目最近的新闻并生成一篇报告,最后保存到本地。"
 
      # Construct and run the society
      society = construct_society(question)
owl/run_mcp.py ADDED
@@ -0,0 +1,184 @@
+ # ========= Copyright 2023-2024 @ CAMEL-AI.org. All Rights Reserved. =========
+ # Licensed under the Apache License, Version 2.0 (the "License");
+ # you may not use this file except in compliance with the License.
+ # You may obtain a copy of the License at
+ #
+ #     http://www.apache.org/licenses/LICENSE-2.0
+ #
+ # Unless required by applicable law or agreed to in writing, software
+ # distributed under the License is distributed on an "AS IS" BASIS,
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ # See the License for the specific language governing permissions and
+ # limitations under the License.
+ # ========= Copyright 2023-2024 @ CAMEL-AI.org. All Rights Reserved. =========
+ """MCP Multi-Agent System Example
+
+ This example demonstrates how to use MCP (Model Context Protocol) with CAMEL agents
+ for advanced information retrieval and processing tasks.
+
+ Environment Setup:
+
+ 1. Configure the required dependencies of the owl library.
+    Refer to https://github.com/camel-ai/owl for the installation guide.
+
+ 2. MCP Server Setup:
+
+    2.1 MCP Desktop Commander (File System Service):
+        Prerequisites: Node.js and npm
+        ```bash
+        # Install MCP service
+        npx -y @smithery/cli install @wonderwhy-er/desktop-commander --client claude
+        npx @wonderwhy-er/desktop-commander setup
+
+        # Configure in owl/mcp_servers_config.json:
+        {
+            "desktop-commander": {
+                "command": "npx",
+                "args": [
+                    "-y",
+                    "@wonderwhy-er/desktop-commander"
+                ]
+            }
+        }
+        ```
+
+    2.2 MCP Playwright Service:
+        ```bash
+        # Install MCP service
+        npm install -g @executeautomation/playwright-mcp-server
+        npx playwright install-deps
+
+        # Configure in mcp_servers_config.json:
+        {
+            "mcpServers": {
+                "playwright": {
+                    "command": "npx",
+                    "args": ["-y", "@executeautomation/playwright-mcp-server"]
+                }
+            }
+        }
+        ```
+
+    2.3 MCP Fetch Service (optional, for better retrieval):
+        ```bash
+        # Install MCP service
+        pip install mcp-server-fetch
+
+        # Configure in mcp_servers_config.json:
+        {
+            "mcpServers": {
+                "fetch": {
+                    "command": "python",
+                    "args": ["-m", "mcp_server_fetch"]
+                }
+            }
+        }
+        ```
+
+ Usage:
+ 1. Ensure all MCP servers are properly configured in mcp_servers_config.json
+ 2. Run this script to create a multi-agent system that can:
+    - Access and manipulate files through MCP Desktop Commander
+    - Perform web automation tasks using Playwright
+    - Process and generate information using GPT-4o
+    - Fetch web content (if the fetch service is configured)
+ 3. The system will execute the specified task while maintaining security through
+    controlled access
+
+ Note:
+ - All file operations are restricted to configured directories
+ - The system uses GPT-4o for both user and assistant roles
+ - Supports asynchronous operations for efficient processing
+ """
+
+ import asyncio
+ from pathlib import Path
+ from typing import List
+
+ from dotenv import load_dotenv
+
+ from camel.logger import set_log_level
+ from camel.models import ModelFactory
+ from camel.toolkits import FunctionTool, MCPToolkit
+ from camel.types import ModelPlatformType, ModelType
+
+ from utils.enhanced_role_playing import OwlRolePlaying, run_society
+
+
+ load_dotenv()
+ set_log_level(level="DEBUG")
+
+
+ async def construct_society(
+     question: str,
+     tools: List[FunctionTool],
+ ) -> OwlRolePlaying:
+     r"""Build a multi-agent OwlRolePlaying instance.
+
+     Args:
+         question (str): The question to ask.
+         tools (List[FunctionTool]): The MCP tools to use.
+     """
+     models = {
+         "user": ModelFactory.create(
+             model_platform=ModelPlatformType.OPENAI,
+             model_type=ModelType.GPT_4O,
+             model_config_dict={"temperature": 0},
+         ),
+         "assistant": ModelFactory.create(
+             model_platform=ModelPlatformType.OPENAI,
+             model_type=ModelType.GPT_4O,
+             model_config_dict={"temperature": 0},
+         ),
+     }
+
+     user_agent_kwargs = {"model": models["user"]}
+     assistant_agent_kwargs = {
+         "model": models["assistant"],
+         "tools": tools,
+     }
+
+     task_kwargs = {
+         "task_prompt": question,
+         "with_task_specify": False,
+     }
+
+     society = OwlRolePlaying(
+         **task_kwargs,
+         user_role_name="user",
+         user_agent_kwargs=user_agent_kwargs,
+         assistant_role_name="assistant",
+         assistant_agent_kwargs=assistant_agent_kwargs,
+     )
+     return society
+
+
+ async def main():
+     config_path = Path(__file__).parent / "mcp_servers_config.json"
+     mcp_toolkit = MCPToolkit(config_path=str(config_path))
+
+     try:
+         await mcp_toolkit.connect()
+
+         question = (
+             "I'd like an academic report about Andrew Ng, including his research "
+             "direction, published papers (at least 3), institutions, etc. "
+             "Then organize the report in Markdown format and save it to my desktop."
+         )
+
+         # Gather the tools exposed by all connected MCP servers
+         tools = [*mcp_toolkit.get_tools()]
+         society = await construct_society(question, tools)
+         answer, chat_history, token_count = await run_society(society)
+         print(f"\033[94mAnswer: {answer}\033[0m")
+
+     finally:
+         # Make sure to disconnect safely after all operations are completed.
+         try:
+             await mcp_toolkit.disconnect()
+         except Exception:
+             print("Disconnect failed")
+
+
+ if __name__ == "__main__":
+     asyncio.run(main())
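The `main()` coroutine above follows a connect / run / always-disconnect lifecycle. A self-contained sketch of that pattern using only the standard library; `FakeToolkit` is a stand-in for `MCPToolkit`, purely for illustration:

```python
import asyncio


class FakeToolkit:
    """Stand-in for an MCP toolkit; records lifecycle events."""

    def __init__(self):
        self.connected = False
        self.events = []

    async def connect(self):
        self.connected = True
        self.events.append("connect")

    async def disconnect(self):
        self.connected = False
        self.events.append("disconnect")


async def run_task(toolkit: FakeToolkit) -> str:
    try:
        await toolkit.connect()
        # ... agents would run here; even a failure reaches the finally block
        return "answer"
    finally:
        # Disconnect even if the task raised, mirroring run_mcp.main()
        try:
            await toolkit.disconnect()
        except Exception:
            print("Disconnect failed")


toolkit = FakeToolkit()
answer = asyncio.run(run_task(toolkit))
print(answer, toolkit.events)
# answer ['connect', 'disconnect']
```

The nested `try` inside `finally` keeps a failed disconnect from masking an exception raised by the task itself.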
owl/run_terminal.py CHANGED
@@ -18,7 +18,7 @@ from camel.toolkits import (
      SearchToolkit,
      BrowserToolkit,
      FileWriteToolkit,
-     TerminalToolkit
+     TerminalToolkit,
  )
  from camel.types import ModelPlatformType, ModelType
  from camel.logger import set_log_level
@@ -30,6 +30,7 @@ set_log_level(level="DEBUG")
  # Get current script directory
  base_dir = os.path.dirname(os.path.abspath(__file__))
 
+
  def construct_society(question: str) -> OwlRolePlaying:
      r"""Construct a society of agents based on the given question.
 
@@ -113,7 +114,9 @@ def main():
      answer, chat_history, token_count = run_society(society)
 
      # Output the result
-     print(f"\033[94mAnswer: {answer}\nChat History: {chat_history}\ntoken_count:{token_count}\033[0m")
+     print(
+         f"\033[94mAnswer: {answer}\nChat History: {chat_history}\ntoken_count:{token_count}\033[0m"
+     )
 
 
  if __name__ == "__main__":
owl/run_terminal_zh.py CHANGED
@@ -12,13 +12,13 @@
  # limitations under the License.
  # ========= Copyright 2023-2024 @ CAMEL-AI.org. All Rights Reserved. =========
  from dotenv import load_dotenv
-
+ import os
  from camel.models import ModelFactory
  from camel.toolkits import (
      SearchToolkit,
      BrowserToolkit,
      FileWriteToolkit,
-     TerminalToolkit
+     TerminalToolkit,
  )
  from camel.types import ModelPlatformType, ModelType
  from camel.logger import set_log_level
@@ -27,10 +27,12 @@ from utils import OwlRolePlaying, run_society
 
  load_dotenv()
  set_log_level(level="DEBUG")
- import os
+
+
  # Get current script directory
  base_dir = os.path.dirname(os.path.abspath(__file__))
 
+
  def construct_society(question: str) -> OwlRolePlaying:
      r"""Construct a society of agents based on the given question.
 
@@ -112,7 +114,9 @@ def main():
      answer, chat_history, token_count = run_society(society)
 
      # Output the result
-     print(f"\033[94mAnswer: {answer}\nChat History: {chat_history}\ntoken_count:{token_count}\033[0m")
+     print(
+         f"\033[94mAnswer: {answer}\nChat History: {chat_history}\ntoken_count:{token_count}\033[0m"
+     )
 
 
  if __name__ == "__main__":
owl/utils/enhanced_role_playing.py CHANGED
@@ -152,7 +152,7 @@ Please note that the task may be very complicated. Do not attempt to solve the t
  Here are some tips that will help you to give more valuable instructions about our task to me:
  <tips>
  - I have various tools to use, such as search toolkit, web browser simulation toolkit, document relevant toolkit, code execution toolkit, etc. Thus, You must think how human will solve the task step-by-step, and give me instructions just like that. For example, one may first use google search to get some initial information and the target url, then retrieve the content of the url, or do some web browser interaction to find the answer.
- - Although the task is complex, the answer does exist. If you cant find the answer using the current scheme, try to re-plan and use other ways to find the answer, e.g. using other tools or methods that can achieve similar results.
+ - Although the task is complex, the answer does exist. If you can't find the answer using the current scheme, try to re-plan and use other ways to find the answer, e.g. using other tools or methods that can achieve similar results.
  - Always remind me to verify my final answer about the overall task. This work can be done by using multiple tools(e.g., screenshots, webpage analysis, etc.), or something else.
  - If I have written code, please remind me to run the code and get the result.
  - Search results typically do not provide precise answers. It is not likely to find the answer directly using search toolkit only, the search query should be concise and focuses on finding sources rather than direct answers, as it always need to use other tools to further process the url, e.g. interact with the webpage, extract webpage content, etc.
@@ -281,6 +281,74 @@ Please note that our overall task may be very complicated. Here are some tips th
          ),
      )
 
+     async def astep(
+         self, assistant_msg: BaseMessage
+     ) -> Tuple[ChatAgentResponse, ChatAgentResponse]:
+         user_response = await self.user_agent.astep(assistant_msg)
+         if user_response.terminated or user_response.msgs is None:
+             return (
+                 ChatAgentResponse(msgs=[], terminated=False, info={}),
+                 ChatAgentResponse(
+                     msgs=[],
+                     terminated=user_response.terminated,
+                     info=user_response.info,
+                 ),
+             )
+         user_msg = self._reduce_message_options(user_response.msgs)
+
+         modified_user_msg = deepcopy(user_msg)
+
+         if "TASK_DONE" not in user_msg.content:
+             modified_user_msg.content += f"""\n
+             Here are auxiliary information about the overall task, which may help you understand the intent of the current task:
+             <auxiliary_information>
+             {self.task_prompt}
+             </auxiliary_information>
+             If there are available tools and you want to call them, never say 'I will ...', but first call the tool and reply based on tool call's result, and tell me which tool you have called.
+             """
+         else:
+             # The task is done, and the assistant agent needs to give the final answer about the original task
+             modified_user_msg.content += f"""\n
+             Now please make a final answer of the original task based on our conversation: <task>{self.task_prompt}</task>
+             """
+
+         assistant_response = await self.assistant_agent.astep(modified_user_msg)
+         if assistant_response.terminated or assistant_response.msgs is None:
+             return (
+                 ChatAgentResponse(
+                     msgs=[],
+                     terminated=assistant_response.terminated,
+                     info=assistant_response.info,
+                 ),
+                 ChatAgentResponse(
+                     msgs=[user_msg], terminated=False, info=user_response.info
+                 ),
+             )
+         assistant_msg = self._reduce_message_options(assistant_response.msgs)
+
+         modified_assistant_msg = deepcopy(assistant_msg)
+         if "TASK_DONE" not in user_msg.content:
+             modified_assistant_msg.content += f"""\n
+             Provide me with the next instruction and input (if needed) based on my response and our current task: <task>{self.task_prompt}</task>
+             Before producing the final answer, please check whether I have rechecked the final answer using different toolkit as much as possible. If not, please remind me to do that.
+             If I have written codes, remind me to run the codes.
+             If you think our task is done, reply with `TASK_DONE` to end our conversation.
+             """
+
+         return (
+             ChatAgentResponse(
+                 msgs=[assistant_msg],
+                 terminated=assistant_response.terminated,
+                 info=assistant_response.info,
+             ),
+             ChatAgentResponse(
+                 msgs=[user_msg],
+                 terminated=user_response.terminated,
+                 info=user_response.info,
+             ),
+         )
+
 
  class OwlGAIARolePlaying(OwlRolePlaying):
      def __init__(self, **kwargs):
@@ -369,23 +437,23 @@ class OwlGAIARolePlaying(OwlRolePlaying):
      )
 
 
- def run_society(
-     society: RolePlaying, round_limit: int = 15
+ async def run_society(
+     society: OwlRolePlaying,
+     round_limit: int = 15,
  ) -> Tuple[str, List[dict], dict]:
      overall_completion_token_count = 0
      overall_prompt_token_count = 0
 
      chat_history = []
      init_prompt = """
- Now please give me instructions to solve over overall task step by step. If the task requires some specific knowledge, please instruct me to use tools to complete the task.
- """
+ Now please give me instructions to solve our overall task step by step. If the task requires some specific knowledge, please instruct me to use tools to complete the task.
+ """
      input_msg = society.init_chat(init_prompt)
      for _round in range(round_limit):
-         assistant_response, user_response = society.step(input_msg)
+         assistant_response, user_response = await society.astep(input_msg)
          overall_completion_token_count += (
              assistant_response.info["usage"]["completion_tokens"]
              + user_response.info["usage"]["completion_tokens"]
          )
          overall_prompt_token_count += (
              assistant_response.info["usage"]["prompt_tokens"]
              + user_response.info["usage"]["prompt_tokens"]
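The async `run_society` / `astep` pair above implements an alternating turn loop: the user agent issues an instruction, the assistant executes it, and the loop stops at `TASK_DONE` or the round limit. A stripped-down stdlib sketch of that control flow; both "agents" here are stand-in coroutines, not CAMEL agents:

```python
import asyncio


async def user_agent(round_no: int, limit: int) -> str:
    # Issue instructions, then signal completion on the last round
    return "TASK_DONE" if round_no >= limit - 1 else f"instruction {round_no}"


async def assistant_agent(instruction: str) -> str:
    # Pretend to execute the instruction
    return f"did: {instruction}"


async def run_society_sketch(round_limit: int = 3) -> list:
    history = []
    for round_no in range(round_limit):
        instruction = await user_agent(round_no, round_limit)
        history.append(await assistant_agent(instruction))
        if "TASK_DONE" in instruction:
            break
    return history


history = asyncio.run(run_society_sketch())
print(history)
# ['did: instruction 0', 'did: instruction 1', 'did: TASK_DONE']
```

In the real implementation the assistant turn carries the tool calls (MCP tools included), and token usage from both responses is accumulated each round.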
owl/utils/gaia.py CHANGED
@@ -191,7 +191,9 @@ class GAIABenchmark(BaseBenchmark):
          except Exception as e:
              logger.warning(e)
              # raise FileNotFoundError(f"{self.save_to} does not exist.")
-         datas = [data for data in datas if not self._check_task_completed(data["task_id"])]
+         datas = [
+             data for data in datas if not self._check_task_completed(data["task_id"])
+         ]
          logger.info(f"Number of tasks to be processed: {len(datas)}")
          # Process tasks
          for task in tqdm(datas, desc="Running"):
pyproject.toml CHANGED
@@ -21,10 +21,12 @@ keywords = [
      "learning-systems"
  ]
  dependencies = [
-     "camel-ai[all]==0.2.27",
+     "camel-ai[all]==0.2.28",
      "chunkr-ai>=0.0.41",
      "docx2markdown>=0.1.1",
      "gradio>=3.50.2",
+     "mcp-simple-arxiv==0.2.2",
+     "mcp-server-fetch==2025.1.17",
  ]
 
  [project.urls]
requirements.txt CHANGED
@@ -1,4 +1,4 @@
- camel-ai[all]==0.2.27
+ camel-ai[all]==0.2.28
  chunkr-ai>=0.0.41
  docx2markdown>=0.1.1
  gradio>=3.50.2
run_app.py CHANGED
@@ -22,7 +22,8 @@ import os
  import sys
  from pathlib import Path
 
- os.environ['PYTHONIOENCODING'] = 'utf-8'
+ os.environ["PYTHONIOENCODING"] = "utf-8"
+
 
  def main():
      """Main function to launch the OWL Intelligent Assistant Platform"""
run_app_zh.py CHANGED
@@ -22,7 +22,8 @@ import os
  import sys
  from pathlib import Path
 
- os.environ['PYTHONIOENCODING'] = 'utf-8'
+ os.environ["PYTHONIOENCODING"] = "utf-8"
+
 
  def main():
      """主函数,启动OWL智能助手运行平台"""
uv.lock CHANGED
@@ -482,7 +482,7 @@ wheels = [
 
  [[package]]
  name = "camel-ai"
- version = "0.2.27"
+ version = "0.2.28"
  source = { registry = "https://pypi.org/simple" }
  dependencies = [
      { name = "colorama" },
@@ -499,9 +499,9 @@ dependencies = [
      { name = "pyyaml" },
      { name = "tiktoken" },
  ]
- sdist = { url = "https://files.pythonhosted.org/packages/ff/27/2bce666ae7f7d0db276d037b3afe84a460e782438e5cacc08de20417233b/camel_ai-0.2.27.tar.gz", hash = "sha256:4689245ad48f51e5e602d2651cf463afe212bcf046633a19c2189574c1f3481a", size = 441363 }
+ sdist = { url = "https://files.pythonhosted.org/packages/6a/3b/7f350ae3c5bf42263688d3a69333e3908af4d45ce8f5f838af634a2720b3/camel_ai-0.2.28.tar.gz", hash = "sha256:f47e12bdf59df6e789db4587f0c5bd0adf43b2029d6be1bfcc31bfd41cab9d9f", size = 443082 }
  wheels = [
-     { url = "https://files.pythonhosted.org/packages/b0/fa/94f5b41cb6babc81aac00494b170ec2bea058b6c00f477ceb3e886c49177/camel_ai-0.2.27-py3-none-any.whl", hash = "sha256:c4a6597791faf2f2161c56c2579e60850557b126135b29af77ebd08fa0774e0b", size = 746387 },
+     { url = "https://files.pythonhosted.org/packages/5d/27/8a6e97f660354ce03413872268c7f4a40ceefdf39b20f161cb7f672dc67c/camel_ai-0.2.28-py3-none-any.whl", hash = "sha256:079e7e905a36b64be47a6a27ad4b99d21ca0403b27027a4d777744968a22040a", size = 748237 },
  ]
 
  [package.optional-dependencies]
@@ -2685,6 +2685,19 @@ wheels = [
      { url = "https://files.pythonhosted.org/packages/42/d7/1ec15b46af6af88f19b8e5ffea08fa375d433c998b8a7639e76935c14f1f/markdown_it_py-3.0.0-py3-none-any.whl", hash = "sha256:355216845c60bd96232cd8d8c40e8f9765cc86f46880e43a8fd22dc1a1a8cab1", size = 87528 },
  ]
 
+ [[package]]
+ name = "markdownify"
+ version = "1.1.0"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+     { name = "beautifulsoup4" },
+     { name = "six" },
+ ]
+ sdist = { url = "https://files.pythonhosted.org/packages/2f/78/c48fed23c7aebc2c16049062e72de1da3220c274de59d28c942acdc9ffb2/markdownify-1.1.0.tar.gz", hash = "sha256:449c0bbbf1401c5112379619524f33b63490a8fa479456d41de9dc9e37560ebd", size = 17127 }
+ wheels = [
+     { url = "https://files.pythonhosted.org/packages/64/11/b751af7ad41b254a802cf52f7bc1fca7cabe2388132f2ce60a1a6b9b9622/markdownify-1.1.0-py3-none-any.whl", hash = "sha256:32a5a08e9af02c8a6528942224c91b933b4bd2c7d078f9012943776fc313eeef", size = 13901 },
+ ]
+
  [[package]]
  name = "markupsafe"
  version = "2.1.5"
@@ -3571,14 +3616,18 @@ dependencies = [
      { name = "chunkr-ai" },
      { name = "docx2markdown" },
      { name = "gradio" },
  ]
 
  [package.metadata]
  requires-dist = [
-     { name = "camel-ai", extras = ["all"], specifier = "==0.2.27" },
+     { name = "camel-ai", extras = ["all"], specifier = "==0.2.28" },
      { name = "chunkr-ai", specifier = ">=0.0.41" },
      { name = "docx2markdown", specifier = ">=0.1.1" },
      { name = "gradio", specifier = ">=3.50.2" },
  ]
 
  [[package]]
@@ -2806,6 +2819,38 @@ wheels = [
      { url = "https://files.pythonhosted.org/packages/d0/d2/a9e87b506b2094f5aa9becc1af5178842701b27217fa43877353da2577e3/mcp-1.3.0-py3-none-any.whl", hash = "sha256:2829d67ce339a249f803f22eba5e90385eafcac45c94b00cab6cef7e8f217211", size = 70672 },
  ]
 
+ [[package]]
+ name = "mcp-server-fetch"
+ version = "2025.1.17"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+     { name = "markdownify" },
+     { name = "mcp" },
+     { name = "protego" },
+     { name = "pydantic" },
+     { name = "readabilipy" },
2832
+ { name = "requests" },
2833
+ ]
2834
+ sdist = { url = "https://files.pythonhosted.org/packages/99/76/204ac83afe2000b1513b4741229586128361f376fab03832695e0179104d/mcp_server_fetch-2025.1.17.tar.gz", hash = "sha256:aa3a5dee358651103477bc121b98ada18a5c35840c56e4016cc3b40e7df1aa7d", size = 43468 }
2835
+ wheels = [
2836
+ { url = "https://files.pythonhosted.org/packages/d7/34/c0dce3415b627f763a9b7a0202a6a0672446b49f5ca04827340c28d75c63/mcp_server_fetch-2025.1.17-py3-none-any.whl", hash = "sha256:53c4967572464c6329824c9b05cdfa5fe214004d577ae8700fdb04203844be52", size = 7991 },
2837
+ ]
2838
+
2839
+ [[package]]
2840
+ name = "mcp-simple-arxiv"
2841
+ version = "0.2.2"
2842
+ source = { registry = "https://pypi.org/simple" }
2843
+ dependencies = [
2844
+ { name = "beautifulsoup4" },
2845
+ { name = "feedparser" },
2846
+ { name = "httpx" },
2847
+ { name = "mcp" },
2848
+ ]
2849
+ sdist = { url = "https://files.pythonhosted.org/packages/20/d3/d47bfce067ea85bc73154d8299549f84455e601f699fcff513f9d44cef0d/mcp_simple_arxiv-0.2.2.tar.gz", hash = "sha256:e27cfd58a470dcec7d733bd09b4219daddbdc3475a6d256e246a114e5b94e817", size = 12100 }
2850
+ wheels = [
2851
+ { url = "https://files.pythonhosted.org/packages/07/4e/6646a0004fc85b0c1df6e662db42f76fe5a0412179b7f65c066d7804370a/mcp_simple_arxiv-0.2.2-py3-none-any.whl", hash = "sha256:fcf607303c074ae5e88337b5bf3ea52cd781081f49ddf8fa0898eb3b8420dccb", size = 13686 },
2852
+ ]
2853
+
2854
  [[package]]
2855
  name = "mdurl"
2856
  version = "0.1.2"
 
3616
  { name = "chunkr-ai" },
3617
  { name = "docx2markdown" },
3618
  { name = "gradio" },
3619
+ { name = "mcp-server-fetch" },
3620
+ { name = "mcp-simple-arxiv" },
3621
  ]
3622
 
3623
  [package.metadata]
3624
  requires-dist = [
3625
+ { name = "camel-ai", extras = ["all"], specifier = "==0.2.28" },
3626
  { name = "chunkr-ai", specifier = ">=0.0.41" },
3627
  { name = "docx2markdown", specifier = ">=0.1.1" },
3628
  { name = "gradio", specifier = ">=3.50.2" },
3629
+ { name = "mcp-server-fetch", specifier = "==2025.1.17" },
3630
+ { name = "mcp-simple-arxiv", specifier = "==0.2.2" },
3631
  ]
3632
 
3633
  [[package]]
 
4011
  { url = "https://files.pythonhosted.org/packages/b5/35/6c4c6fc8774a9e3629cd750dc24a7a4fb090a25ccd5c3246d127b70f9e22/propcache-0.3.0-py3-none-any.whl", hash = "sha256:67dda3c7325691c2081510e92c561f465ba61b975f481735aefdfc845d2cd043", size = 12101 },
4012
  ]
4013
 
4014
+ [[package]]
4015
+ name = "protego"
4016
+ version = "0.4.0"
4017
+ source = { registry = "https://pypi.org/simple" }
4018
+ sdist = { url = "https://files.pythonhosted.org/packages/4e/6b/84e878d0567dfc11538bad6ce2595cee7ae0c47cf6bf7293683c9ec78ef8/protego-0.4.0.tar.gz", hash = "sha256:93a5e662b61399a0e1f208a324f2c6ea95b23ee39e6cbf2c96246da4a656c2f6", size = 3246425 }
4019
+ wheels = [
4020
+ { url = "https://files.pythonhosted.org/packages/d9/fd/8d84d75832b0983cecf3aff7ae48362fe96fc8ab6ebca9dcf3cefd87e79c/Protego-0.4.0-py2.py3-none-any.whl", hash = "sha256:37640bc0ebe37572d624453a21381d05e9d86e44f89ff1e81794d185a0491666", size = 8553 },
4021
+ ]
4022
+
4023
  [[package]]
4024
  name = "proto-plus"
4025
  version = "1.26.0"
 
4731
  { url = "https://files.pythonhosted.org/packages/09/f6/fa777f336629aee8938f3d5c95c09df38459d4eadbdbe34642889857fb6a/rapidfuzz-3.12.2-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:54bb69ebe5ca0bd7527357e348f16a4c0c52fe0c2fcc8a041010467dcb8385f7", size = 1555000 },
4732
  ]
4733
 
4734
+ [[package]]
4735
+ name = "readabilipy"
4736
+ version = "0.3.0"
4737
+ source = { registry = "https://pypi.org/simple" }
4738
+ dependencies = [
4739
+ { name = "beautifulsoup4" },
4740
+ { name = "html5lib" },
4741
+ { name = "lxml" },
4742
+ { name = "regex" },
4743
+ ]
4744
+ sdist = { url = "https://files.pythonhosted.org/packages/b8/e4/260a202516886c2e0cc6e6ae96d1f491792d829098886d9529a2439fbe8e/readabilipy-0.3.0.tar.gz", hash = "sha256:e13313771216953935ac031db4234bdb9725413534bfb3c19dbd6caab0887ae0", size = 35491 }
4745
+ wheels = [
4746
+ { url = "https://files.pythonhosted.org/packages/dd/46/8a640c6de1a6c6af971f858b2fb178ca5e1db91f223d8ba5f40efe1491e5/readabilipy-0.3.0-py3-none-any.whl", hash = "sha256:d106da0fad11d5fdfcde21f5c5385556bfa8ff0258483037d39ea6b1d6db3943", size = 22158 },
4747
+ ]
4748
+
4749
  [[package]]
4750
  name = "redis"
4751
  version = "5.2.1"
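
For context on what this lockfile change enables: `mcp-server-fetch` and `mcp-simple-arxiv` are standalone Model Context Protocol (MCP) servers that a client launches as subprocesses and talks to over stdio. A minimal sketch of the conventional `mcpServers` JSON configuration such a client could consume is shown below; the server keys and the `python -m` entry points (in particular `mcp_simple_arxiv`) are assumptions based on common MCP packaging, not taken from this diff:

```json
{
  "mcpServers": {
    "fetch": {
      "command": "python",
      "args": ["-m", "mcp_server_fetch"]
    },
    "arxiv": {
      "command": "python",
      "args": ["-m", "mcp_simple_arxiv"]
    }
  }
}
```

Pinning both servers with `==` specifiers (rather than `>=`, as used for the other dependencies) keeps the MCP sample reproducible, since these packages version independently of the rest of the stack.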