Spaces: HUBioDataLab/DrugGEN

mgyigit committed on Mar 29

Commit 8dde4ec (verified) · 1 Parent(s): 540e177

Update app.py

Files changed (1)

app.py +10 -9

@@ -222,38 +222,38 @@ with gr.Blocks(theme=gr.themes.Ocean()) as demo:
             with gr.Accordion("About DrugGEN Models", open=False):
                 gr.Markdown("""
-                    ## Model Variations
                     ### DrugGEN-AKT1
                     This model is designed to generate molecules targeting the human AKT1 protein (UniProt ID: P31749). Trained with [2,607 bioactive compounds](https://drive.google.com/file/d/1B2OOim5wrUJalixeBTDKXLHY8BAIvNh-/view?usp=drive_link).
                     Molecules larger than 45 heavy atoms were excluded.
                     ### DrugGEN-CDK2
-                    This model is designed to generate molecules targeting the human CDK2 protein (UniProt ID: P24941). Trained with [1,817 bioactive compounds](https://drive.google.com/file/d/1C0CGFKx0I2gdSfbIEgUO7q3K2S1P9ksT/view?usp=drive_link)/
                     Molecules larger than 38 heavy atoms were excluded.
                     ### DrugGEN-NoTarget
                     This is a general-purpose model that generates diverse drug-like molecules without targeting a specific protein. Trained with a general [ChEMBL dataset](https://drive.google.com/file/d/1oyybQ4oXpzrme_n0kbwc0-CFjvTFSlBG/view?usp=drive_link).
                     Molecules larger than 45 heavy atoms were excluded.
-                    - Useful for exploring chemical space, generating diverse scaffolds, and creating molecules with drug-like properties.
                     For more details, see our [paper on arXiv](https://arxiv.org/abs/2302.07868).
                 """)
             with gr.Accordion("Understanding the Metrics", open=False):
                 gr.Markdown("""
-                    ## Evaluation Metrics
                     ### Basic Metrics
                     - **Validity**: Percentage of generated molecules that are chemically valid
                     - **Uniqueness**: Percentage of unique molecules among valid ones
                     - **Runtime**: Time taken to generate or evaluate the molecules
                     ### Novelty Metrics
-                    - **Novelty (Train)**: Percentage of molecules not found in the [training set](https://drive.google.com/file/d/1oyybQ4oXpzrme_n0kbwc0-CFjvTFSlBG/view?usp=drive_link)
-                    - **Novelty (Inference)**: Percentage of molecules not found in the [test set](https://drive.google.com/file/d/1vMGXqK1SQXB3Od3l80gMWvTEOjJ5MFXP/view?usp=share_link)
-                    - **Novelty (Real Inhibitors)**: Percentage of molecules not found in known inhibitors of the target protein
                     ### Structural Metrics
                     - **Average Length**: Normalized average number of atoms in the generated molecules, normalized by the maximum atom count (e.g., 45 for AKT1/NoTarget, 38 for CDK2)
@@ -302,6 +302,7 @@ with gr.Blocks(theme=gr.themes.Ocean()) as demo:
                 with gr.TabItem("Custom Input SMILES"):
                         custom_smiles = gr.Textbox(
                             label="Input SMILES (one per line, maximum 100 molecules)",
                             placeholder="C(C(=O)O)N\nCCO\n...",
                             lines=10
                         )

             with gr.Accordion("About DrugGEN Models", open=False):
                 gr.Markdown("""
                     ### DrugGEN-AKT1
                     This model is designed to generate molecules targeting the human AKT1 protein (UniProt ID: P31749). Trained with [2,607 bioactive compounds](https://drive.google.com/file/d/1B2OOim5wrUJalixeBTDKXLHY8BAIvNh-/view?usp=drive_link).
                     Molecules larger than 45 heavy atoms were excluded.
                     ### DrugGEN-CDK2
+                    This model is designed to generate molecules targeting the human CDK2 protein (UniProt ID: P24941). Trained with [1,817 bioactive compounds](https://drive.google.com/file/d/1C0CGFKx0I2gdSfbIEgUO7q3K2S1P9ksT/view?usp=drive_link).
                     Molecules larger than 38 heavy atoms were excluded.
                     ### DrugGEN-NoTarget
                     This is a general-purpose model that generates diverse drug-like molecules without targeting a specific protein. Trained with a general [ChEMBL dataset](https://drive.google.com/file/d/1oyybQ4oXpzrme_n0kbwc0-CFjvTFSlBG/view?usp=drive_link).
                     Molecules larger than 45 heavy atoms were excluded.
+                        - Useful for exploring chemical space, generating diverse scaffolds, and creating molecules with drug-like properties.
                     For more details, see our [paper on arXiv](https://arxiv.org/abs/2302.07868).
                 """)
             with gr.Accordion("Understanding the Metrics", open=False):
                 gr.Markdown("""
                     ### Basic Metrics
                     - **Validity**: Percentage of generated molecules that are chemically valid
                     - **Uniqueness**: Percentage of unique molecules among valid ones
                     - **Runtime**: Time taken to generate or evaluate the molecules
                     ### Novelty Metrics
+                    - **Novelty (Train)**: Percentage of molecules not found in the [training set](https://drive.google.com/file/d/1oyybQ4oXpzrme_n0kbwc0-CFjvTFSlBG/view?usp=drive_link). These molecules are used as inputs to
+                    the generator during training.
+                    - **Novelty (Inference)**: Percentage of molecules not found in the [inference set](https://drive.google.com/file/d/1vMGXqK1SQXB3Od3l80gMWvTEOjJ5MFXP/view?usp=share_link). These molecules are used as inputs
+                    to the generator during inference.
+                    - **Novelty (Real Inhibitors)**: Percentage of molecules not found in known inhibitors of the target protein (see About DrugGEN Models for details). These molecules are used as inputs to the
+                    discriminator during training.
                     ### Structural Metrics
                     - **Average Length**: Normalized average number of atoms in the generated molecules, normalized by the maximum atom count (e.g., 45 for AKT1/NoTarget, 38 for CDK2)
                 with gr.TabItem("Custom Input SMILES"):
                         custom_smiles = gr.Textbox(
                             label="Input SMILES (one per line, maximum 100 molecules)",
+                            info="This space runs on a CPU, which may result in slower performance. Generating 100 molecules takes approximately 6 minutes, so the input is capped at 100 molecules.\nMolecules larger than the allowed maximum length (45 for AKT1/NoTarget, 38 for CDK2) or containing atom types that are not allowed will be filtered out.",
                             placeholder="C(C(=O)O)N\nCCO\n...",
                             lines=10
                         )
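
The metric descriptions and the 100-molecule input cap in this diff can be illustrated with a small, dependency-free sketch. It is not the code in app.py: the function names (`parse_smiles_input`, `uniqueness`, `novelty`, `normalized_average_length`) are hypothetical, the SMILES strings are assumed to be already validated and canonicalized (for example with RDKit) so that plain string comparison is meaningful, and novelty is assumed to be computed over the unique generated molecules. Values are returned as fractions; multiply by 100 for the percentages shown in the UI.

```python
from typing import Iterable, List, Set

MAX_INPUT_MOLECULES = 100  # cap stated in the "Input SMILES" textbox label


def parse_smiles_input(text: str, cap: int = MAX_INPUT_MOLECULES) -> List[str]:
    """Split textbox content into one SMILES per line and enforce the cap."""
    smiles = [line.strip() for line in text.splitlines() if line.strip()]
    if len(smiles) > cap:
        raise ValueError(f"Got {len(smiles)} molecules; the maximum is {cap}.")
    return smiles


def uniqueness(valid_smiles: List[str]) -> float:
    """Fraction of unique molecules among the valid ones."""
    if not valid_smiles:
        return 0.0
    return len(set(valid_smiles)) / len(valid_smiles)


def novelty(generated: Iterable[str], reference: Set[str]) -> float:
    """Fraction of unique generated molecules absent from a reference set.

    The reference set can be the training set, the inference set, or the
    known inhibitors, giving the three novelty variants.
    """
    unique = set(generated)
    if not unique:
        return 0.0
    return len(unique - reference) / len(unique)


def normalized_average_length(atom_counts: List[int], max_atoms: int) -> float:
    """Mean atom count divided by the model's maximum (45 or 38)."""
    if not atom_counts:
        return 0.0
    return sum(atom_counts) / len(atom_counts) / max_atoms


if __name__ == "__main__":
    mols = parse_smiles_input("C(C(=O)O)N\nCCO\n")
    print(mols)
    print(uniqueness(["CCO", "CCO", "CCN", "C"]))
    print(novelty(["CCO", "CCN"], {"CCO"}))
    print(normalized_average_length([30, 15], 45))
```

Because the same molecule can be written as several different SMILES strings, comparing raw strings without canonicalization would overstate both uniqueness and novelty.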