Spaces:
Runtime error
A newer version of the Gradio SDK is available:
5.29.0
VenusFactory Download Tab User Guide
InterPro Metadata
Description: Downloads protein domain information from InterPro database.
Source: InterPro Database
Download Options:
- Single ID: Download data for a specific InterPro domain (e.g., IPR000001)
- From JSON: Batch download using a JSON file containing multiple InterPro entries
Output Format:
download/interpro_domain/
βββ IPR000001/
βββ detail.json # Detailed protein information
βββ meta.json # Metadata including accession and protein count
βββ uids.txt # List of UniProt IDs associated with this domain
RCSB Metadata
Description: Downloads structural metadata from the RCSB Protein Data Bank.
Source: RCSB PDB
Download Options:
- Single ID: Download metadata for a specific PDB entry (e.g., 1a0j)
- From File: Batch download using a text file containing PDB IDs
Output Format:
download/rcsb_metadata/
βββ 1a0j.json # Contains structure metadata including:
# - Resolution
# - Experimental method
# - Publication info
# - Chain information
UniProt Sequences
Description: Downloads protein sequences from UniProt database.
Source: UniProt
Download Options:
- Single ID: Download sequence for a specific UniProt entry (e.g., P00734)
- From File: Batch download using a text file containing UniProt IDs
- Merge Option: Combine all sequences into a single FASTA file
Output Format:
download/uniprot_sequences/
βββ P00734.fasta # Individual FASTA files (when not merged)
βββ merged.fasta # Combined sequences (when merge option is selected)
RCSB Structures
Description: Downloads 3D structure files from RCSB Protein Data Bank.
Source: RCSB PDB
Download Options:
- Single ID: Download structure for a specific PDB entry
- From File: Batch download using a text file containing PDB IDs
- File Types:
- cif: mmCIF format (recommended)
- pdb: Legacy PDB format
- xml: PDBML/XML format
- sf: Structure factors
- mr: NMR restraints
- Unzip Option: Automatically decompress downloaded files
Output Format:
download/rcsb_structures/
βββ 1a0j.pdb # Uncompressed structure file (with unzip)
βββ 1a0j.pdb.gz # Compressed structure file (without unzip)
AlphaFold2 Structures
Description: Downloads predicted protein structures from AlphaFold Protein Structure Database.
Source: AlphaFold DB
Download Options:
- Single ID: Download structure for a specific UniProt entry
- From File: Batch download using a text file containing UniProt IDs
- Index Level: Organize files in subdirectories based on ID prefix
Output Format:
download/alphafold2_structures/
βββ P/ # With index_level=1
βββ P0/ # With index_level=2
βββ P00734.pdb # AlphaFold predicted structure
Common Features
- Error Handling: All components support error file generation
- Output Directory: Customizable output paths
- Batch Processing: Support for multiple IDs via file input
- Progress Tracking: Real-time download progress and status updates
Input File Formats
- PDB ID List (for RCSB downloads):
1a0j
4hhb
1hho
- UniProt ID List (for UniProt and AlphaFold):
P00734
P61823
Q8WZ42
- InterPro JSON (for batch InterPro downloads):
[
{
"metadata": {
"accession": "IPR000001"
}
},
{
"metadata": {
"accession": "IPR000002"
}
}
]
Error Files
When enabled, failed downloads are logged to failed.txt
in the output directory:
P00734 - Download failed: 404 Not Found
1a0j - Connection timeout