Update app.py
app.py
CHANGED
@@ -222,38 +222,38 @@ with gr.Blocks(theme=gr.themes.Ocean()) as demo:
|
|
222 |
|
223 |
with gr.Accordion("About DrugGEN Models", open=False):
|
224 |
gr.Markdown("""
|
225 |
-
## Model Variations
|
226 |
-
|
227 |
### DrugGEN-AKT1
|
228 |
This model is designed to generate molecules targeting the human AKT1 protein (UniProt ID: P31749). Trained with [2,607 bioactive compounds](https://drive.google.com/file/d/1B2OOim5wrUJalixeBTDKXLHY8BAIvNh-/view?usp=drive_link).
|
229 |
Molecules larger than 45 heavy atoms were excluded.
|
230 |
|
231 |
### DrugGEN-CDK2
|
232 |
-
This model is designed to generate molecules targeting the human CDK2 protein (UniProt ID: P24941). Trained with [1,817 bioactive compounds](https://drive.google.com/file/d/1C0CGFKx0I2gdSfbIEgUO7q3K2S1P9ksT/view?usp=drive_link)
|
233 |
Molecules larger than 38 heavy atoms were excluded.
|
234 |
|
235 |
### DrugGEN-NoTarget
|
236 |
This is a general-purpose model that generates diverse drug-like molecules without targeting a specific protein. Trained with a general [ChEMBL dataset]((https://drive.google.com/file/d/1oyybQ4oXpzrme_n0kbwc0-CFjvTFSlBG/view?usp=drive_link)
|
237 |
Molecules larger than 45 heavy atoms were excluded.
|
238 |
|
239 |
-
|
|
|
240 |
|
241 |
For more details, see our [paper on arXiv](https://arxiv.org/abs/2302.07868).
|
242 |
""")
|
243 |
|
244 |
with gr.Accordion("Understanding the Metrics", open=False):
|
245 |
gr.Markdown("""
|
246 |
-
## Evaluation Metrics
|
247 |
-
|
248 |
### Basic Metrics
|
249 |
- **Validity**: Percentage of generated molecules that are chemically valid
|
250 |
- **Uniqueness**: Percentage of unique molecules among valid ones
|
251 |
- **Runtime**: Time taken to generate or evaluate the molecules
|
252 |
|
253 |
### Novelty Metrics
|
254 |
-
- **Novelty (Train)**: Percentage of molecules not found in the [training set](https://drive.google.com/file/d/1oyybQ4oXpzrme_n0kbwc0-CFjvTFSlBG/view?usp=drive_link)
|
255 |
-
|
256 |
-
- **Novelty (
|
|
|
|
|
|
|
257 |
|
258 |
### Structural Metrics
|
259 |
- **Average Length**: Normalized average number of atoms in the generated molecules, normalized by the maximum atom count (e.g., 45 for AKT1/NoTarget, 38 for CDK2)
|
@@ -302,6 +302,7 @@ with gr.Blocks(theme=gr.themes.Ocean()) as demo:
|
|
302 |
with gr.TabItem("Custom Input SMILES"):
|
303 |
custom_smiles = gr.Textbox(
|
304 |
label="Input SMILES (one per line, maximum 100 molecules)",
|
|
|
305 |
placeholder="C(C(=O)O)N\nCCO\n...",
|
306 |
lines=10
|
307 |
)
|
|
|
222 |
|
223 |
with gr.Accordion("About DrugGEN Models", open=False):
|
224 |
gr.Markdown("""
|
|
|
|
|
225 |
### DrugGEN-AKT1
|
226 |
This model is designed to generate molecules targeting the human AKT1 protein (UniProt ID: P31749). Trained with [2,607 bioactive compounds](https://drive.google.com/file/d/1B2OOim5wrUJalixeBTDKXLHY8BAIvNh-/view?usp=drive_link).
|
227 |
Molecules larger than 45 heavy atoms were excluded.
|
228 |
|
229 |
### DrugGEN-CDK2
|
230 |
+
This model is designed to generate molecules targeting the human CDK2 protein (UniProt ID: P24941). Trained with [1,817 bioactive compounds](https://drive.google.com/file/d/1C0CGFKx0I2gdSfbIEgUO7q3K2S1P9ksT/view?usp=drive_link).
|
231 |
Molecules larger than 38 heavy atoms were excluded.
|
232 |
|
233 |
### DrugGEN-NoTarget
|
234 |
This is a general-purpose model that generates diverse drug-like molecules without targeting a specific protein. Trained with a general [ChEMBL dataset]((https://drive.google.com/file/d/1oyybQ4oXpzrme_n0kbwc0-CFjvTFSlBG/view?usp=drive_link)
|
235 |
Molecules larger than 45 heavy atoms were excluded.
|
236 |
|
237 |
+
- Useful for exploring chemical space, generating diverse scaffolds, and creating molecules with drug-like properties.
|
238 |
+
|
239 |
|
240 |
For more details, see our [paper on arXiv](https://arxiv.org/abs/2302.07868).
|
241 |
""")
|
242 |
|
243 |
with gr.Accordion("Understanding the Metrics", open=False):
|
244 |
gr.Markdown("""
|
|
|
|
|
245 |
### Basic Metrics
|
246 |
- **Validity**: Percentage of generated molecules that are chemically valid
|
247 |
- **Uniqueness**: Percentage of unique molecules among valid ones
|
248 |
- **Runtime**: Time taken to generate or evaluate the molecules
|
249 |
|
250 |
### Novelty Metrics
|
251 |
+
- **Novelty (Train)**: Percentage of molecules not found in the [training set](https://drive.google.com/file/d/1oyybQ4oXpzrme_n0kbwc0-CFjvTFSlBG/view?usp=drive_link). These molecules are used as inputs to
|
252 |
+
the generator during training.
|
253 |
+
- **Novelty (Inference)**: Percentage of molecules not found in the [inference set](https://drive.google.com/file/d/1vMGXqK1SQXB3Od3l80gMWvTEOjJ5MFXP/view?usp=share_link). These molecules are used as inputs
|
254 |
+
to the generator during inference.
|
255 |
+
- **Novelty (Real Inhibitors)**: Percentage of molecules not found in known inhibitors of the target protein (look at About DrugGEN Models for details). These molecules are used as inputs to the
|
256 |
+
discriminator during training.
|
257 |
|
258 |
### Structural Metrics
|
259 |
- **Average Length**: Normalized average number of atoms in the generated molecules, normalized by the maximum atom count (e.g., 45 for AKT1/NoTarget, 38 for CDK2)
|
|
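The metric definitions in the hunk above can be illustrated with a small sketch. This is not the Space's actual evaluation code. It assumes SMILES strings are already canonicalized, that uniqueness is computed over valid molecules (as the text states), and that novelty is computed over the unique valid molecules (an assumption, since the text does not name the denominator). The `evaluate`, `toy_is_valid`, and `toy_count_atoms` names are hypothetical. The toy validity check and atom counter are placeholders only; a real pipeline would use a cheminformatics toolkit for both.

```python
from typing import Callable, Dict, Iterable, Set

# Heavy-atom limits quoted in the Space's description.
MAX_ATOMS = {"AKT1": 45, "NoTarget": 45, "CDK2": 38}


def evaluate(
    generated: Iterable[str],
    reference: Set[str],
    is_valid: Callable[[str], bool],
    count_atoms: Callable[[str], int],
    max_atoms: int,
) -> Dict[str, float]:
    """Compute validity, uniqueness, novelty, and normalized average length."""
    generated = list(generated)
    valid = [s for s in generated if is_valid(s)]
    unique = set(valid)
    # Assumption: novelty is measured over unique valid molecules.
    novel = unique - reference

    def pct(part: int, whole: int) -> float:
        return 100.0 * part / whole if whole else 0.0

    mean_atoms = (
        sum(count_atoms(s) for s in unique) / len(unique) if unique else 0.0
    )
    return {
        "validity": pct(len(valid), len(generated)),
        "uniqueness": pct(len(unique), len(valid)),
        "novelty": pct(len(novel), len(unique)),
        "average_length": mean_atoms / max_atoms,
    }


# Hypothetical stand-ins so the sketch runs without a chemistry library.
def toy_is_valid(smiles: str) -> bool:
    # Placeholder: only checks for non-empty input and balanced parentheses.
    return bool(smiles) and smiles.count("(") == smiles.count(")")


def toy_count_atoms(smiles: str) -> int:
    # Placeholder: counts uppercase letters as a crude atom count.
    return sum(ch.isupper() for ch in smiles)


scores = evaluate(
    generated=["CCO", "CCO", "C(C", "CCN"],
    reference={"CCO"},
    is_valid=toy_is_valid,
    count_atoms=toy_count_atoms,
    max_atoms=MAX_ATOMS["AKT1"],
)
print(scores)
```

The three novelty variants in the diff (Train, Inference, Real Inhibitors) would differ only in which `reference` set is passed in.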
|
302 |
with gr.TabItem("Custom Input SMILES"):
|
303 |
custom_smiles = gr.Textbox(
|
304 |
label="Input SMILES (one per line, maximum 100 molecules)",
|
305 |
+
info="This space runs on a CPU, which may result in slower performance. Generating 100 molecules takes approximately 6 minutes. Therefore, we set a 100-molecule cap.\n molecules larger than allowed maximum length (45 for AKT1/NoTarget and 38 for CDK2) and allowed atom types are going to be filtered.",
|
306 |
placeholder="C(C(=O)O)N\nCCO\n...",
|
307 |
lines=10
|
308 |
)
|
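For context on the added `info` text, here is a minimal sketch of how the one-per-line input and the 100-molecule cap could be handled before generation. The function name `parse_smiles_input` is hypothetical, and truncating to the first 100 entries is an assumption; the diff does not show whether the app truncates or rejects oversized input. The later filtering by maximum length and allowed atom types is not shown here.

```python
from typing import List

MAX_MOLECULES = 100  # cap stated in the Textbox label


def parse_smiles_input(text: str, max_molecules: int = MAX_MOLECULES) -> List[str]:
    """Split textbox content into SMILES strings, one per line."""
    # Strip surrounding whitespace and drop blank lines.
    smiles = [line.strip() for line in text.splitlines() if line.strip()]
    # Assumption: extra entries beyond the cap are silently dropped.
    return smiles[:max_molecules]


example = parse_smiles_input("C(C(=O)O)N\nCCO\n\n  CCN  \n")
print(example)
```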