transformers torch torchaudio pydub datasets ipywidgets==8.1.3 numpy==1.26.4 soundfile SentencePiece jiwer evaluate huggingface-hub speechbrain accelerate librosa matplotlib tensorboardX ffmpeg