Dixing (Dex) Xu committed
Commit 5c7fa16 · unverified · 1 Parent(s): 5ddb3df

Feat: Add WebUI (please merge with squash) (#27)


* :lipstick: Add webUI for aideml

* add gradio webui
* update requirements
* update gitignore

* :lipstick: Update webUI and use logs to display results

* :lipstick: Add gradio theme and streamlit app

* :lipstick: Update streamlit webui

* :lipstick: Fix serialization issues in the config display

* :lipstick: Fix results at bottom issue

* :lipstick: Update the streamlit webui theme

* :lipstick: Update the streamlit app theme

* :fire: Remove gradio and use streamlit

* :rotating_light: add linter workflow and fix issues

* Add github action for linter
* Fix current linter issues

* :construction_worker: Add github templates

* Add issue templates
* Add PR template

* :rotating_light: Add black formatter (#25)

* :rotating_light: Add black formatter

* Add black to linter
* Fix the formatting

* :rotating_light: Update black linter to see the suggestion change

* :lipstick: Add webUI for aideml

* add gradio webui
* update requirements
* update gitignore

* :lipstick: Update streamlit app

* :rotating_light: fix linter issue

* :construction_worker: Fix styling issues for tree viz

* :construction_worker: Finalize the theme

* :recycle: Refactor webui code

* :memo: Update README

* :rotating_light: fix linter

* :bug: fix conflict

* :heavy_plus_sign: Add dependency

.gitignore CHANGED
@@ -167,4 +167,4 @@ logs
 .trunk
 
 .gradio/
-.ruff_cache/
+.ruff_cache/
.streamlit/config.toml ADDED
@@ -0,0 +1,17 @@
+[theme]
+# Primary colors
+primaryColor="#0D0F18" # --wecopink: 343 98% 63%
+backgroundColor="#F0EFE9" # --background: 49 10% 94%
+secondaryBackgroundColor="#FFFFFF" # --card: 60 33.3% 98%
+textColor="#0A0A0A" # --primary: 0 0% 17%
+
+# Font
+font="sans serif"
+
+[ui]
+hideTopBar = true
+
+[client]
+toolbarMode = "minimal"
+showErrorDetails = true
+showSidebarNavigation = false
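
For context: Streamlit automatically applies a `[theme]` from `.streamlit/config.toml` in the working directory when an app starts, so no code in this commit needs to load it explicitly. A minimal sketch that would pick up this theme (the file name `minimal_app.py` is hypothetical, not part of this commit):

```python
# minimal_app.py - hypothetical example; run with: streamlit run minimal_app.py
# Streamlit reads .streamlit/config.toml from the current working directory,
# so the [theme] colors above are applied to this page automatically.
import streamlit as st

st.title("Theme check")
st.write("This page renders with the custom AIDE theme.")
```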
README.md CHANGED
@@ -11,6 +11,7 @@ AIDE is an LLM agent that generates solutions for machine learning tasks just fr
 AIDE is the state-of-the-art agent on OpenAI's [MLE-bench](https://arxiv.org/pdf/2410.07095), a benchmark composed of 75 Kaggle machine learning tasks, where we achieved four times more medals compared to the runner-up agent architecture.
 
 In our own benchmark composed of over 60 Kaggle data science competitions, AIDE demonstrated impressive performance, surpassing 50% of Kaggle participants on average (see our [technical report](https://www.weco.ai/blog/technical-report) for details).
+
 More specifically, AIDE has the following features:
 
 1. **Instruct with Natural Language**: Describe your problem or additional requirements and expert insights, all in natural language.
@@ -18,7 +19,7 @@ More specifically, AIDE has the following features:
 3. **Iterative Optimization**: AIDE iteratively runs, debugs, evaluates, and improves the ML code, all by itself.
 4. **Visualization**: We also provide tools to visualize the solution tree produced by AIDE for a better understanding of its experimentation process. This gives you insights not only about what works but also what doesn't.
 
-# How to use AIDE?
+# How to Use AIDE?
 
 ## Setup
 
@@ -38,7 +39,7 @@ export OPENAI_API_KEY=<your API key>
 export ANTHROPIC_API_KEY=<your API key>
 ```
 
-## Running AIDE via the command line
+## Running AIDE via the Command Line
 
 To run AIDE:
 
@@ -54,9 +55,9 @@ aide data_dir="example_tasks/house_prices" goal="Predict the sales price for eac
 
 Options:
 
-- `data_dir` (required): a directory containing all the data relevant for your task (`.csv` files, images, etc.).
-- `goal`: describe what you want the models to predict in your task, for example, "Build a timeseries forcasting model for bitcoin close price" or "Predict sales price for houses".
-- `eval`: the evaluation metric used to evaluate the ML models for the task (e.g., accuracy, F1, Root-Mean-Squared-Error, etc.)
+- `data_dir` (required): A directory containing all the data relevant for your task (`.csv` files, images, etc.).
+- `goal`: Describe what you want the models to predict in your task, for example, "Build a time series forecasting model for bitcoin close price" or "Predict sales price for houses".
+- `eval`: The evaluation metric used to evaluate the ML models for the task (e.g., accuracy, F1, Root-Mean-Squared-Error, etc.).
 
 Alternatively, you can provide the entire task description as a `desc_str` string, or write it in a plaintext file and pass its path as `desc_file` ([example file](aide/example_tasks/house_prices.md)).
 
@@ -66,19 +67,19 @@ aide data_dir="my_data_dir" desc_file="my_task_description.txt"
 
 The result of the run will be stored in the `logs` directory.
 
-- `logs/<experiment-id>/best_solution.py`: Python code of _best solution_ according to the validation metric
-- `logs/<experiment-id>/journal.json`: a JSON file containing the metadata of the experiment runs, including all the code generated in intermediate steps, plan, evaluation results, etc.
-- `logs/<experiment-id>/tree_plot.html`: you can open it in your browser. It contains visualization of solution tree, which details the experimentation process of finding and optimizing ML code. You can explore and interact with the tree visualization to view what plan and code AIDE comes up with in each step.
+- `logs/<experiment-id>/best_solution.py`: Python code of the _best solution_ according to the validation metric.
+- `logs/<experiment-id>/journal.json`: A JSON file containing the metadata of the experiment runs, including all the code generated in intermediate steps, plan, evaluation results, etc.
+- `logs/<experiment-id>/tree_plot.html`: You can open it in your browser. It contains a visualization of the solution tree, which details the experimentation process of finding and optimizing ML code. You can explore and interact with the tree visualization to view what plan and code AIDE comes up with in each step.
 
 The `workspaces` directory will contain all the files and data that the agent generated.
 
 ### Advanced Usage
 
-To further customize the behaviour of AIDE, some useful options might be:
+To further customize the behavior of AIDE, some useful options might be:
 
-- `agent.code.model=...` to configure which model the agent should use for coding (default is `gpt-4-turbo`)
-- `agent.steps=...` to configure how many improvement iterations the agent should run (default is 20)
-- `agent.search.num_drafts=...` to configure the number of initial drafts the agent should generate (default is 5)
+- `agent.code.model=...` to configure which model the agent should use for coding (default is `gpt-4-turbo`).
+- `agent.steps=...` to configure how many improvement iterations the agent should run (default is 20).
+- `agent.search.num_drafts=...` to configure the number of initial drafts the agent should generate (default is 5).
 
 You can check the [`config.yaml`](aide/utils/config.yaml) file for more options.
 
@@ -88,23 +89,73 @@ AIDE supports using local LLMs through OpenAI-compatible APIs. Here's how to set
 
 1. Set up a local LLM server with an OpenAI-compatible API endpoint. You can use:
    - [Ollama](https://github.com/ollama/ollama)
-   - or similar solutions
+   - or similar solutions.
 
 2. Configure your environment to use the local endpoint:
+
 ```bash
 export OPENAI_BASE_URL="http://localhost:11434/v1" # For Ollama
 export OPENAI_API_KEY="local-llm" # Can be any string if your local server doesn't require authentication
 ```
 
 3. Update the model configuration in your AIDE command or config. For example, with Ollama:
+
 ```bash
 # Example with house prices dataset
 aide agent.code.model="qwen2.5" agent.feedback.model="qwen2.5" report.model="qwen2.5" \
   data_dir="example_tasks/house_prices" \
   goal="Predict the sales price for each house" \
   eval="Use the RMSE metric between the logarithm of the predicted and observed values."
 ```
 
+## Running AIDE via the Web UI
+
+We have developed a user-friendly Web UI using Streamlit to make it even easier to interact with AIDE.
+
+### Prerequisites
+
+Ensure you have installed the development version of AIDE and its dependencies as described in the [Development](#development) section.
+
+### Running the Web UI
+
+Navigate to the `aide/webui` directory and run the Streamlit application:
+
+```bash
+cd aide/webui
+streamlit run app.py
+```
+
+Alternatively, you can run it from the root directory:
+
+```bash
+streamlit run aide/webui/app.py
+```
+
+### Using the Web UI
+
+1. **API Key Configuration**: In the sidebar, input your OpenAI API key or Anthropic API key and click "Save API Keys".
+
+2. **Input Data**:
+   - You can either **upload your dataset files** (`.csv`, `.txt`, `.json`, `.md`) using the "Upload Data Files" feature.
+   - Or click on "Load Example Experiment" to use the example house prices dataset.
+
+3. **Define Goal and Evaluation Criteria**:
+   - In the "Goal" text area, describe what you want the model to achieve (e.g., "Predict the sales price for each house").
+   - In the "Evaluation Criteria" text area, specify the evaluation metric (e.g., "Use the RMSE metric between the logarithm of the predicted and observed values.").
+
+4. **Configure Steps**:
+   - Use the slider to set the number of steps (iterations) for the experiment.
+
+5. **Run the Experiment**:
+   - Click on "Run AIDE" to start the experiment.
+   - Progress and status updates will be displayed in the "Results" section.
+
+6. **View Results**:
+   - **Tree Visualization**: Explore the solution tree to understand how AIDE experimented and optimized the models.
+   - **Best Solution**: View the Python code of the best solution found.
+   - **Config**: Review the configuration used for the experiment.
+   - **Journal**: Examine the detailed journal entries for each step.
+
 ## Using AIDE in Python
 
 Using AIDE within your Python script/project is easy. Follow the setup steps above, and then create an AIDE experiment like below and start running:
@@ -113,7 +164,7 @@ Using AIDE within your Python script/project is easy. Follow the setup steps abo
 import aide
 exp = aide.Experiment(
     data_dir="example_tasks/bitcoin_price", # replace this with your own directory
-    goal="Build a timeseries forcasting model for bitcoin close price.", # replace with your own goal description
+    goal="Build a time series forecasting model for bitcoin close price.", # replace with your own goal description
     eval="RMSLE" # replace with your own evaluation metric
 )
 
@@ -125,7 +176,7 @@ print(f"Best solution code: {best_solution.code}")
 
 ## Development
 
-To install AIDE for development, clone this repository and install it locally.
+To install AIDE for development, clone this repository and install it locally:
 
 ```bash
 git clone https://github.com/WecoAI/aideml.git
@@ -133,33 +184,45 @@ cd aideml
 pip install -e .
 ```
 
+### Running the Web UI in Development Mode
+
+Ensure that you have all the required development dependencies installed. Then, you can run the Web UI as follows:
+
+```bash
+cd aide/webui
+streamlit run app.py
+```
+
 ## Using AIDE with Docker
 
 You can also run AIDE using Docker:
 
-1. Build the Docker image:
+1. **Build the Docker Image**:
+
 ```bash
 docker build -t aide .
 ```
 
-2. Run AIDE with Docker (example with house prices task):
+2. **Run AIDE with Docker** (example with house prices task):
+
 ```bash
 # Set custom workspace and logs location (optional)
 export WORKSPACE_BASE=$(pwd)/workspaces
 export LOGS_DIR=$(pwd)/logs
 
 docker run -it --rm \
   -v "${LOGS_DIR:-$(pwd)/logs}:/app/logs" \
  -v "${WORKSPACE_BASE:-$(pwd)/workspaces}:/app/workspaces" \
  -v "$(pwd)/aide/example_tasks:/app/data" \
  -e OPENAI_API_KEY="your-actual-api-key" \
  aide \
  data_dir=/app/data/house_prices \
  goal="Predict the sales price for each house" \
  eval="Use the RMSE metric between the logarithm of the predicted and observed values."
 ```
 
 You can customize the location of workspaces and logs by setting environment variables before running the container:
+
 - `WORKSPACE_BASE`: Sets the base directory for AIDE workspaces (default: `$(pwd)/workspaces`)
 - `LOGS_DIR`: Sets the directory for AIDE logs (default: `$(pwd)/logs`)
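For quick reference, the Python-API flow documented in the README above boils down to the following sketch. It assumes, as the surrounding README snippets indicate, that `exp.run(steps=...)` returns the best solution node with a `.code` attribute:

```python
import aide

# Mirror of the README example: point AIDE at a data directory,
# state the goal and the evaluation metric, then let it iterate.
exp = aide.Experiment(
    data_dir="example_tasks/bitcoin_price",
    goal="Build a time series forecasting model for bitcoin close price.",
    eval="RMSLE",
)

best_solution = exp.run(steps=10)  # assumption: run() returns the best node
print(f"Best solution code: {best_solution.code}")
```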
aide/webui/__init__.py ADDED
File without changes
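A note on this file: the empty `__init__.py` makes `aide.webui` an importable Python package, so the app class defined in `app.py` below can also be pulled into other scripts. A hypothetical import sketch:

```python
# Hypothetical usage sketch: import the Streamlit app class from the new package.
from aide.webui.app import WebUI
```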
aide/webui/app.py ADDED
@@ -0,0 +1,535 @@
+import streamlit as st
+import streamlit.components.v1 as components
+from pathlib import Path
+import tempfile
+import shutil
+import os
+import json
+from omegaconf import OmegaConf
+from rich.console import Console
+import sys
+from dotenv import load_dotenv
+import logging
+from aide import Experiment
+
+# Set up logging configuration
+logging.basicConfig(
+    level=logging.INFO,
+    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
+    handlers=[logging.StreamHandler(sys.stderr)],
+)
+
+logger = logging.getLogger("aide")
+logger.setLevel(logging.INFO)
+
+console = Console(file=sys.stderr)
+
+
+class WebUI:
+    """
+    WebUI encapsulates the Streamlit application logic for the AIDE Machine Learning Engineer Agent.
+    """
+
+    def __init__(self):
+        """
+        Initialize the WebUI with environment variables and session state.
+        """
+        self.env_vars = self.load_env_variables()
+        self.project_root = Path(__file__).parent.parent.parent
+        self.config_session_state()
+        self.setup_page()
+
+    @staticmethod
+    def load_env_variables():
+        """
+        Load API keys and environment variables from .env file.
+
+        Returns:
+            dict: Dictionary containing API keys.
+        """
+        load_dotenv()
+        return {
+            "openai_key": os.getenv("OPENAI_API_KEY", ""),
+            "anthropic_key": os.getenv("ANTHROPIC_API_KEY", ""),
+        }
+
+    @staticmethod
+    def config_session_state():
+        """
+        Configure default values for Streamlit session state.
+        """
+        if "is_running" not in st.session_state:
+            st.session_state.is_running = False
+        if "current_step" not in st.session_state:
+            st.session_state.current_step = 0
+        if "total_steps" not in st.session_state:
+            st.session_state.total_steps = 0
+        if "progress" not in st.session_state:
+            st.session_state.progress = 0
+        if "results" not in st.session_state:
+            st.session_state.results = None
+
+    @staticmethod
+    def setup_page():
+        """
+        Set up the Streamlit page configuration and load custom CSS.
+        """
+        st.set_page_config(
+            page_title="AIDE: Machine Learning Engineer Agent",
+            layout="wide",
+        )
+        WebUI.load_css()
+
+    @staticmethod
+    def load_css():
+        """
+        Load custom CSS styles from 'style.css' file.
+        """
+        css_file = Path(__file__).parent / "style.css"
+        if css_file.exists():
+            with open(css_file) as f:
+                st.markdown(f"<style>{f.read()}</style>", unsafe_allow_html=True)
+        else:
+            st.warning(f"CSS file not found at: {css_file}")
+
+    def run(self):
+        """
+        Run the main logic of the Streamlit application.
+        """
+        self.render_sidebar()
+        input_col, results_col = st.columns([1, 3])
+        with input_col:
+            self.render_input_section(results_col)
+        with results_col:
+            self.render_results_section()
+
+    def render_sidebar(self):
+        """
+        Render the sidebar with API key settings.
+        """
+        with st.sidebar:
+            st.header("⚙️ Settings")
+            st.markdown(
+                "<p style='text-align: center;'>OpenAI API Key</p>",
+                unsafe_allow_html=True,
+            )
+            openai_key = st.text_input(
+                "OpenAI API Key",
+                value=self.env_vars["openai_key"],
+                type="password",
+                label_visibility="collapsed",
+            )
+            st.markdown(
+                "<p style='text-align: center;'>Anthropic API Key</p>",
+                unsafe_allow_html=True,
+            )
+            anthropic_key = st.text_input(
+                "Anthropic API Key",
+                value=self.env_vars["anthropic_key"],
+                type="password",
+                label_visibility="collapsed",
+            )
+            if st.button("Save API Keys", use_container_width=True):
+                st.session_state.openai_key = openai_key
+                st.session_state.anthropic_key = anthropic_key
+                st.success("API keys saved!")
+
+    def render_input_section(self, results_col):
+        """
+        Render the input section of the application.
+
+        Args:
+            results_col (st.delta_generator.DeltaGenerator): The results column to pass to methods.
+        """
+        st.header("Input")
+        uploaded_files = self.handle_file_upload()
+        goal_text, eval_text, num_steps = self.handle_user_inputs()
+        if st.button("Run AIDE", type="primary", use_container_width=True):
+            with st.spinner("AIDE is running..."):
+                results = self.run_aide(
+                    uploaded_files, goal_text, eval_text, num_steps, results_col
+                )
+                st.session_state.results = results
+
+    def handle_file_upload(self):
+        """
+        Handle file uploads and example file loading.
+
+        Returns:
+            list: List of uploaded or example files.
+        """
+        if st.button(
+            "Load Example Experiment", type="primary", use_container_width=True
+        ):
+            st.session_state.example_files = self.load_example_files()
+
+        if st.session_state.get("example_files"):
+            st.info("Example files loaded! Click 'Run AIDE' to proceed.")
+            with st.expander("View Loaded Files", expanded=False):
+                for file in st.session_state.example_files:
+                    st.text(f"📄 {file['name']}")
+            uploaded_files = st.session_state.example_files
+        else:
+            uploaded_files = st.file_uploader(
+                "Upload Data Files",
+                accept_multiple_files=True,
+                type=["csv", "txt", "json", "md"],
+            )
+        return uploaded_files
+
+    def handle_user_inputs(self):
+        """
+        Handle goal, evaluation criteria, and number of steps inputs.
+
+        Returns:
+            tuple: Goal text, evaluation criteria text, and number of steps.
+        """
+        goal_text = st.text_area(
+            "Goal",
+            value=st.session_state.get("goal", ""),
+            placeholder="Example: Predict house prices",
+        )
+        eval_text = st.text_area(
+            "Evaluation Criteria",
+            value=st.session_state.get("eval", ""),
+            placeholder="Example: Use RMSE metric",
+        )
+        num_steps = st.slider(
+            "Number of Steps",
+            min_value=1,
+            max_value=20,
+            value=st.session_state.get("steps", 10),
+        )
+        return goal_text, eval_text, num_steps
+
+    @staticmethod
+    def load_example_files():
+        """
+        Load example files from the 'example_tasks/house_prices' directory.
+
+        Returns:
+            list: List of example files with their paths.
+        """
+        package_root = Path(__file__).parent.parent
+        example_dir = package_root / "example_tasks" / "house_prices"
+
+        if not example_dir.exists():
+            st.error(f"Example directory not found at: {example_dir}")
+            return []
+
+        example_files = []
+
+        for file_path in example_dir.glob("*"):
+            if file_path.suffix.lower() in [".csv", ".txt", ".json", ".md"]:
+                with tempfile.NamedTemporaryFile(
+                    delete=False, suffix=file_path.suffix
+                ) as tmp_file:
+                    tmp_file.write(file_path.read_bytes())
+                    example_files.append(
+                        {"name": file_path.name, "path": tmp_file.name}
+                    )
+
+        if not example_files:
+            st.warning("No example files found in the example directory")
+
+        st.session_state["goal"] = "Predict the sales price for each house"
+        st.session_state["eval"] = (
+            "Use the RMSE metric between the logarithm of the predicted and observed values."
+        )
+
+        return example_files
+
+    def run_aide(self, files, goal_text, eval_text, num_steps, results_col):
+        """
+        Run the AIDE experiment with the provided inputs.
+
+        Args:
+            files (list): List of uploaded or example files.
+            goal_text (str): The goal of the experiment.
+            eval_text (str): The evaluation criteria.
+            num_steps (int): Number of steps to run.
+            results_col (st.delta_generator.DeltaGenerator): Results column for displaying progress.
+
+        Returns:
+            dict: Dictionary containing the results of the experiment.
+        """
+        try:
+            self.initialize_run_state(num_steps)
+            self.set_api_keys()
+
+            input_dir = self.prepare_input_directory(files)
+            if not input_dir:
+                return None
+
+            experiment = self.initialize_experiment(input_dir, goal_text, eval_text)
+            placeholders = self.create_results_placeholders(results_col, experiment)
+
+            for step in range(num_steps):
+                st.session_state.current_step = step + 1
+                progress = (step + 1) / num_steps
+                self.update_results_placeholders(placeholders, progress)
+                experiment.run(steps=1)
+
+            self.clear_run_state(placeholders)
+
+            return self.collect_results(experiment)
+
+        except Exception as e:
+            st.session_state.is_running = False
+            console.print_exception()
+            st.error(f"Error occurred: {str(e)}")
+            return None
+
+    @staticmethod
+    def initialize_run_state(num_steps):
+        """
+        Initialize the running state for the experiment.
+
+        Args:
+            num_steps (int): Total number of steps in the experiment.
+        """
+        st.session_state.is_running = True
+        st.session_state.current_step = 0
+        st.session_state.total_steps = num_steps
+        st.session_state.progress = 0
+
+    @staticmethod
+    def set_api_keys():
+        """
+        Set the API keys in the environment variables from the session state.
+        """
+        if st.session_state.get("openai_key"):
+            os.environ["OPENAI_API_KEY"] = st.session_state.openai_key
+        if st.session_state.get("anthropic_key"):
+            os.environ["ANTHROPIC_API_KEY"] = st.session_state.anthropic_key
+
+    def prepare_input_directory(self, files):
+        """
+        Prepare the input directory and handle uploaded files.
+
+        Args:
+            files (list): List of uploaded or example files.
+
+        Returns:
+            Path: The input directory path, or None if files are missing.
+        """
+        input_dir = self.project_root / "input"
+        input_dir.mkdir(parents=True, exist_ok=True)
+
+        if files:
+            for file in files:
+                if isinstance(file, dict):  # Example files
+                    shutil.copy2(file["path"], input_dir / file["name"])
+                else:  # Uploaded files
+                    with open(input_dir / file.name, "wb") as f:
+                        f.write(file.getbuffer())
+        else:
+            st.error("Please upload data files")
+            return None
+        return input_dir
+
+    @staticmethod
+    def initialize_experiment(input_dir, goal_text, eval_text):
+        """
+        Initialize the AIDE Experiment.
+
+        Args:
+            input_dir (Path): Path to the input directory.
+            goal_text (str): The goal of the experiment.
+            eval_text (str): The evaluation criteria.
+
+        Returns:
+            Experiment: The initialized Experiment object.
+        """
+        experiment = Experiment(data_dir=str(input_dir), goal=goal_text, eval=eval_text)
+        return experiment
+
+    @staticmethod
+    def create_results_placeholders(results_col, experiment):
+        """
+        Create placeholders in the results column for dynamic content.
+
+        Args:
+            results_col (st.delta_generator.DeltaGenerator): The results column.
+            experiment (Experiment): The Experiment object.
+
+        Returns:
+            dict: Dictionary of placeholders.
+        """
+        with results_col:
+            status_placeholder = st.empty()
+            step_placeholder = st.empty()
+            config_title_placeholder = st.empty()
+            config_placeholder = st.empty()
+            progress_placeholder = st.empty()
+
+            step_placeholder.markdown(
+                f"### 🔥 Running Step {st.session_state.current_step}/{st.session_state.total_steps}"
+            )
+            config_title_placeholder.markdown("### 📋 Configuration")
+            config_placeholder.code(OmegaConf.to_yaml(experiment.cfg), language="yaml")
+            progress_placeholder.progress(0)
+
+        placeholders = {
+            "status": status_placeholder,
+            "step": step_placeholder,
+            "config_title": config_title_placeholder,
+            "config": config_placeholder,
+            "progress": progress_placeholder,
+        }
+        return placeholders
+
+    @staticmethod
+    def update_results_placeholders(placeholders, progress):
+        """
+        Update the placeholders with the current progress.
+
+        Args:
+            placeholders (dict): Dictionary of placeholders.
+            progress (float): Current progress value.
+        """
+        placeholders["step"].markdown(
+            f"### 🔥 Running Step {st.session_state.current_step}/{st.session_state.total_steps}"
+        )
+        placeholders["progress"].progress(progress)
+
+    @staticmethod
+    def clear_run_state(placeholders):
+        """
+        Clear the running state and placeholders after the experiment.
+
+        Args:
+            placeholders (dict): Dictionary of placeholders.
+        """
+        st.session_state.is_running = False
+        placeholders["status"].empty()
+        placeholders["step"].empty()
+        placeholders["config_title"].empty()
+        placeholders["config"].empty()
+        placeholders["progress"].empty()
+
+    @staticmethod
+    def collect_results(experiment):
+        """
+        Collect the results from the experiment.
+
+        Args:
+            experiment (Experiment): The Experiment object.
+
+        Returns:
+            dict: Dictionary containing the collected results.
+        """
+        solution_path = experiment.cfg.log_dir / "best_solution.py"
+        if solution_path.exists():
+            solution = solution_path.read_text()
+        else:
+            solution = "No solution found"
+
+        journal_data = [
+            {
+                "step": node.step,
+                "code": str(node.code),
+                "metric": str(node.metric.value) if node.metric else None,
+                "is_buggy": node.is_buggy,
+            }
+            for node in experiment.journal.nodes
+        ]
+
+        results = {
+            "solution": solution,
+            "config": OmegaConf.to_yaml(experiment.cfg),
+            "journal": json.dumps(journal_data, indent=2, default=str),
+            "tree_path": str(experiment.cfg.log_dir / "tree_plot.html"),
+        }
+        return results
+
+    def render_results_section(self):
+        """
+        Render the results section with tabs for different outputs.
+        """
+        st.header("Results")
+        if st.session_state.get("results"):
+            results = st.session_state.results
+            tabs = st.tabs(["Tree Visualization", "Best Solution", "Config", "Journal"])
+
+            with tabs[0]:
+                self.render_tree_visualization(results)
+            with tabs[1]:
+                self.render_best_solution(results)
+            with tabs[2]:
+                self.render_config(results)
+            with tabs[3]:
+                self.render_journal(results)
+        else:
+            st.info("No results to display. Please run an experiment.")
+
+    @staticmethod
+    def render_tree_visualization(results):
+        """
+        Render the tree visualization from the experiment results.
+
+        Args:
+            results (dict): The results dictionary containing paths and data.
+        """
+        if "tree_path" in results:
+            tree_path = Path(results["tree_path"])
+            logger.info(f"Loading tree visualization from: {tree_path}")
+            if tree_path.exists():
+                with open(tree_path, "r", encoding="utf-8") as f:
+                    html_content = f.read()
+                components.html(html_content, height=600, scrolling=True)
+            else:
+                st.error(f"Tree visualization file not found at: {tree_path}")
+                logger.error(f"Tree file not found at: {tree_path}")
+        else:
+            st.info("No tree visualization available for this run.")
+
+    @staticmethod
+    def render_best_solution(results):
+        """
+        Display the best solution code.
+
+        Args:
+            results (dict): The results dictionary containing the solution.
+        """
+        if "solution" in results:
+            solution_code = results["solution"]
+            st.code(solution_code, language="python")
+        else:
+            st.info("No solution available.")
+
+    @staticmethod
+    def render_config(results):
+        """
+        Display the configuration used in the experiment.
+
+        Args:
+            results (dict): The results dictionary containing the config.
+        """
+        if "config" in results:
+            st.code(results["config"], language="yaml")
+        else:
+            st.info("No configuration available.")
+
+    @staticmethod
+    def render_journal(results):
+        """
+        Display the experiment journal as JSON.
+
+        Args:
+            results (dict): The results dictionary containing the journal.
+        """
+        if "journal" in results:
+            try:
+                journal_data = json.loads(results["journal"])
+                formatted_journal = json.dumps(journal_data, indent=2)
+                st.code(formatted_journal, language="json")
+            except json.JSONDecodeError:
+                st.code(results["journal"], language="json")
+        else:
+            st.info("No journal available.")
+
+
+if __name__ == "__main__":
+    app = WebUI()
+    app.run()
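
For readers who want the same experiment loop without the UI, the core of `run_aide` above reduces to a plain-Python sketch like the following; the `data_dir`, goal, metric, and step count are illustrative, and the artifact paths mirror what `collect_results` reads:

```python
from aide import Experiment

# Headless equivalent of WebUI.run_aide: create the experiment, advance it
# one step at a time (the UI does this to update its progress bar), then
# read the artifacts from the run's log directory.
experiment = Experiment(
    data_dir="input",  # directory prepared with your data files
    goal="Predict the sales price for each house",
    eval="Use the RMSE metric between the logarithm of the predicted and observed values.",
)

num_steps = 10
for step in range(num_steps):
    experiment.run(steps=1)
    print(f"finished step {step + 1}/{num_steps}")

best_solution = experiment.cfg.log_dir / "best_solution.py"
if best_solution.exists():
    print(best_solution.read_text())
```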
aide/webui/style.css ADDED
@@ -0,0 +1,156 @@
+/* Main colors */
+:root {
+    --background: #F2F0E7;
+    --background-shaded: #EBE8DD;
+    --card: #FFFFFF;
+    --primary: #0D0F18;
+    --accent: #F04370;
+    --border: #D4D1C7;
+    --accent-hover: #E13D68;
+    --accent-light: #FEE5EC;
+}
+
+.stVerticalBlock {
+    padding-top: 0rem;
+    padding-bottom: 0rem;
+}
+
+.block-container {
+    padding-top: 0rem;
+    padding-bottom: 0rem;
+}
+header.stAppHeader {
+    display: none;
+}
+section.stMain .block-container {
+    padding-top: 0rem;
+    z-index: 1;
+}
+
+/* Main container */
+.stApp {
+    background-color: var(--background);
+    height: auto;
+    overflow: visible;
+}
+
+/* Widgets */
+.stSelectbox,
+.stTextInput,
+.stNumberInput {
+    background-color: var(--card);
+    border: 1px solid var(--border);
+    border-radius: 0.4rem;
+}
+
+.stMarkdown {
+    color: var(--primary);
+}
+
+/* Code block styling */
+.stCodeBlock {
+    max-height: 400px;
+    overflow-y: auto !important;
+    border: 1px solid var(--border);
+    border-radius: 0.4rem;
+    background-color: var(--background-shaded);
+}
+
+/* Custom scrollbar for code blocks */
+.stCodeBlock::-webkit-scrollbar {
+    width: 8px;
+    height: 8px;
+}
+
+.stCodeBlock::-webkit-scrollbar-track {
+    background: var(--background-shaded);
+    border-radius: 4px;
+}
+
+.stCodeBlock::-webkit-scrollbar-thumb {
+    background: var(--accent);
+    border-radius: 4px;
+}
+
+.stCodeBlock::-webkit-scrollbar-thumb:hover {
+    background: #e13d68;
+}
+
+
+.scrollable-code-container {
+    height: 600px;
+    overflow-y: auto;
+    border: 1px solid var(--border);
+    padding: 15px;
+    border-radius: 5px;
+    background-color: var(--background-shaded);
+}
+
+.scrollable-code-container pre {
+    margin: 0;
+    white-space: pre;
+    overflow-x: auto;
+    font-family: monospace;
+}
+
+.scrollable-code-container code {
+    display: block;
+    min-width: 100%;
+    padding: 0;
+    tab-size: 4;
+}
+
+/* Add custom scrollbar styling for code containers */
+.scrollable-code-container::-webkit-scrollbar {
+    width: 8px;
+    height: 8px;
+}
+
+.scrollable-code-container::-webkit-scrollbar-track {
+    background: var(--background-shaded);
+    border-radius: 4px;
+}
+
+.scrollable-code-container::-webkit-scrollbar-thumb {
+    background: var(--accent);
+    border-radius: 4px;
+}
+
+.scrollable-code-container::-webkit-scrollbar-thumb:hover {
+    background: #e13d68;
+}
+
+/* Style for expander */
+.streamlit-expanderHeader {
+    background-color: var(--card);
+    border: 1px solid var(--border);
+    border-radius: 0.4rem;
+    padding: 0.5rem !important;
+}
+
+.streamlit-expanderHeader:hover {
+    border-color: var(--accent);
+}
+
+/* Style for expander content */
+.streamlit-expanderContent {
+    background-color: var(--background-shaded);
+    border: 1px solid var(--border);
+    border-radius: 0 0 0.4rem 0.4rem;
+    margin-top: -1px;
+    padding: 0.5rem !important;
+}
+
+/* Style for st.code() blocks */
+.stCode {
+    max-height: 600px;
+    overflow-y: auto;
+    background-color: var(--background-shaded) !important;
+    border: 1px solid var(--border) !important;
+    border-radius: 5px !important;
+}
+
+.stCode pre {
+    background-color: var(--background-shaded) !important;
+}
requirements.txt CHANGED
@@ -90,3 +90,4 @@ pyocr
 pyarrow
 xlrd
 backoff
+streamlit