Persian Speech Emotion Recognition with SpeechBrain (ShEMO)
This repository provides an ECAPA-TDNN model for speech emotion recognition in Persian, developed using the SpeechBrain toolkit.
The model has been trained on the ShEMO dataset, which includes annotated emotional speech in Persian.
ECAPA-TDNN was originally designed for speaker verification and is also commonly applied to emotion classification.
Supported Emotion Classes
The model predicts one of the following six emotions: anger, sadness, neutral, surprise, happiness, fear
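As a rough illustration of what "predicts one of six emotions" means for a classifier, the sketch below turns six raw class scores into a single label using softmax followed by argmax. The label order in `EMOTIONS` and the helper name `top_emotion` are assumptions made for this example; the actual index-to-label mapping is defined by the label encoder saved with the checkpoint and may differ.

```python
import math

# Hypothetical label order, for illustration only; the real mapping is
# stored with the model checkpoint and may be ordered differently.
EMOTIONS = ["anger", "sadness", "neutral", "surprise", "happiness", "fear"]


def top_emotion(scores, labels=EMOTIONS):
    """Return (label, probability) for the highest-scoring class."""
    if len(scores) != len(labels):
        raise ValueError("expected one score per label")
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the largest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


if __name__ == "__main__":
    label, prob = top_emotion([0.1, 2.3, 0.4, -1.0, 0.0, 0.2])
    print(f"{label} ({prob:.2f})")
```

The returned probability can serve as a crude confidence value, for example to flag low-confidence clips for manual review.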
How to use this model locally
You can run inference using the included Python script. Here's how:
1. Clone the repository
git lfs install
git clone https://huggingface.co/mobina1380/speechbrain-persian-ser
cd speechbrain-persian-ser
2. Install the required libraries
pip install speechbrain torchaudio
3. Run inference on your audio file
Place your Persian speech file (WAV, mono, 16 kHz) in the repository folder, then run:
from inference import predict
emotion = predict("your_audio.wav")
print("Predicted emotion:", emotion)
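Because the input is expected to be a mono, 16 kHz WAV file, it can help to verify the file header before calling `predict`. The following is a minimal sketch using only Python's standard `wave` module; the helper name `check_wav` is hypothetical and not part of this repository, and the check covers only uncompressed PCM WAV files.

```python
import wave


def check_wav(path, expected_rate=16000, expected_channels=1):
    """Return a list of problems found in the WAV header (empty if none)."""
    problems = []
    # wave.open raises wave.Error if the file is not a PCM WAV file.
    with wave.open(str(path), "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
    if channels != expected_channels:
        problems.append(
            f"expected {expected_channels} channel(s), found {channels}"
        )
    if rate != expected_rate:
        problems.append(f"expected {expected_rate} Hz, found {rate} Hz")
    return problems
```

An empty list means the header matches the expected format; otherwise the file likely needs to be resampled or downmixed with an audio tool of your choice before inference.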
Repository Structure
speechbrain-persian-ser/
├── inference.py # Inference logic
├── hyperparams.yaml # Model configuration
├── custom.yaml # Optional training config
├── save/ # Folder with checkpoints
│   └── CKPT+... # Fine-tuned weights
└── README.md # You're reading it!
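Before running inference, a quick check that the clone contains the entries listed in the tree above can save some debugging time. This is only a sketch: the helper name `missing_files` is hypothetical, and the list of required entries is an assumption based on the tree (the optional `custom.yaml` is left out).

```python
from pathlib import Path

# Entries assumed to be required for inference, taken from the tree above.
EXPECTED = ["inference.py", "hyperparams.yaml", "save"]


def missing_files(repo_dir, expected=EXPECTED):
    """Return the expected entries that are absent from repo_dir."""
    root = Path(repo_dir)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected files found.")
```

Run it from inside the cloned `speechbrain-persian-ser` folder; an empty result only confirms that the entries exist, not that the checkpoint files were fully downloaded.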
License
Model: MIT License
Dataset: ShEMO dataset (check the original license)
Contact
If you have any questions, feedback, or would like to collaborate, feel free to reach out:
Email: [email protected]
Hugging Face: mobina1380