	# Resources

	---

	## GitHub

	- [NeMo](https://github.com/NVIDIA/NeMo)
	- [Llama](https://github.com/facebookresearch/llama)
	- [Demucs](https://github.com/facebookresearch/demucs)
	- [Whisper](https://github.com/openai/whisper)
	- [Whisper NeMo Diarization](https://github.com/MahmoudAshraf97/whisper-diarization)
	- [Text to speech alignment using CTC forced alignment](https://github.com/MahmoudAshraf97/ctc-forced-aligner)
	- [Utilities intended for use with Llama models.](https://github.com/meta-llama/llama-models/)
	- [Llama Recipes: Examples to get started using the Llama models from Meta](https://github.com/meta-llama/llama-recipes)
	- [timsainb/noisereduce: Noise reduction in python using spectral gating](https://github.com/timsainb/noisereduce/)
	- [pyannote/pyannote-audio: Neural building blocks for speaker diarization](https://github.com/pyannote/pyannote-audio)
	- [microsoft/DNS-Challenge: This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.](https://github.com/microsoft/DNS-Challenge)
	- [WenzheLiu-Speech/awesome-speech-enhancement: speech enhancement, speech separation, sound source localization](https://github.com/WenzheLiu-Speech/awesome-speech-enhancement)
	- [nanahou/Awesome-Speech-Enhancement: A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.](https://github.com/nanahou/Awesome-Speech-Enhancement)
	- [jonashaag/speech-enhancement: Collection of papers, datasets and tools on the topic of Speech Dereverberation and Speech Enhancement](https://github.com/jonashaag/speech-enhancement)
	- [yxlu-0102/MP-SENet: Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement](https://github.com/yxlu-0102/MP-SENet)
	- [Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement](https://yxlu-0102.github.io/MP-SENet/)
	- [Noisy speech database for training speech enhancement algorithms and TTS models (superseded: this dataset has been replaced)](https://datashare.ed.ac.uk/handle/10283/1942)

	---

	## Web

	- [Llama](https://www.llama.com/)
	- [Download Llama](https://www.llama.com/llama-downloads/)
	- [Llama 3.2 Requirements](https://llamaimodel.com/requirements-3-2/)
	- [Average handle time (AHT): Formula and tips for improvement](https://www.zendesk.com/blog/average-handle-time/)

	---

	## Notebooks

	- [Hybrid Demucs Music Source Separation](https://colab.research.google.com/drive/1dC9nVxk3V_VPjUADsnFu8EiT-xnU1tGH)

	---

	## PyPI

	- [demucs](https://pypi.org/project/demucs/)
	- [MPSENet](https://pypi.org/project/MPSENet/)

	---

	## Errors

	- [`The file is already fully retrieved; nothing to do.`](https://github.com/facebookresearch/llama/issues/760)

	---

	## Paper

	- [Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation](https://arxiv.org/abs/2007.13975)
	- [MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra](https://arxiv.org/abs/2305.13686)
	- [FINALLY: fast and universal speech enhancement with studio-like quality](https://arxiv.org/abs/2410.05920)
	- [Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement](https://arxiv.org/abs/2308.08926)
	- [A Recurrent Neural Network Approach to the Answering Machine Detection Problem](https://arxiv.org/abs/2410.08235)

	---

	## YouTube

	- [A Course on Speech Enhancement](https://www.youtube.com/playlist?list=PLO9nFIQB53_DU8o0fToNdNFdZuDxD9fAN)
	- [COMS 4995 Final on Speech Enhancement](https://www.youtube.com/watch?v=uRwlSh1FMzc&t=74s)
	- [Achieving Studio-Quality Speech with Generative AI](https://www.youtube.com/watch?v=UxbEjpLMU8s)
	- [How to Fix Bad Podcast Audio](https://www.youtube.com/watch?v=0mPkPQNHsZc)
	- [Speech Enhancement for Cochlear Implant Recipients Using Deep Complex Convolution Transformer With F](https://www.youtube.com/watch?v=i1qTgjMtS2Y)
	- [Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors](https://www.youtube.com/watch?v=4jiQdotz6qY)
	- [2024 Capstone Design, Team 3, Round 2: Neural Network for Speech Enhancement](https://www.youtube.com/watch?v=yOfTYuc9FEQ)
	- [MIAI Deeptails Seminar : Generative Models as Data-driven Priors for Speech Enhancement](https://www.youtube.com/watch?v=XSLgUsgyzUA)
	- [Hardware Efficient Speech Enhancement With Noise Aware Multi Target Deep Learning](https://www.youtube.com/watch?v=qO6JqDUQlsI)
	- [Diffusion Models for Speech Enhancement \| Julius Richter](https://www.youtube.com/watch?v=HMrs6YWDl5M)
	- [Speech Enhancement: Basics & Key Details](https://www.youtube.com/watch?v=5kItH2pq_3E)
	- [Guided Speech Enhancement Network (ICASSP 2023)](https://www.youtube.com/watch?v=JoDqXkAjlh4)
	- [VSANet: Real-time Speech Enhancement Based on Voice Activity Detection and Causal Spatial Attention](https://www.youtube.com/watch?v=GP39vFA2E48)
	- [Research intern talk: Unified speech enhancement approach for speech degradation & noise suppression](https://www.youtube.com/watch?v=_ggfv6eMIJs)
	- [Magnitude and phase spectrum with example](https://www.youtube.com/watch?v=MFOjUgafq0k)
	- [Deep Learning In Audio for Absolute Beginners: From No Experience & No Datasets to a Deployed Model](https://www.youtube.com/watch?v=sqrah49GUkI)
	- [Look Once to Hear: Target Speech Hearing with Noisy Examples](https://www.youtube.com/watch?v=V-XCfnjfQmM)

	---

	## Wikipedia

	- [Speech enhancement](https://en.m.wikipedia.org/wiki/Speech_enhancement)

	---

	## Hugging Face

	- [Models (asteroid)](https://huggingface.co/models?library=asteroid)
	- [cankeles/DPTNet_WHAMR_enhsingle_16k](https://huggingface.co/cankeles/DPTNet_WHAMR_enhsingle_16k)
	- [JacobLinCool/MP-SENet-VB](https://huggingface.co/JacobLinCool/MP-SENet-VB)
	- [JacobLinCool/MP-SENet-DNS](https://huggingface.co/JacobLinCool/MP-SENet-DNS)
	- [ENOT-AutoDL/MP-SENet](https://huggingface.co/ENOT-AutoDL/MP-SENet)

	---

	## Web

	- [Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation](https://paperswithcode.com/paper/dual-path-transformer-network-direct-context-1)
	- [The Audio Developer Conference - ADC is an annual event celebrating all audio development technologies, from music applications and game audio to audio processing and embedded systems.](https://audio.dev/)
	- [Look Once to Hear: Target Speech Hearing with Noisy Examples - CHI '24](https://programs.sigchi.org/chi/2024/program/content/147319)
	- [Reinforcement Learning Based Speech Enhancement for Robust Speech Recognition > Introduction \| Class Central Classroom](https://www.classcentral.com/classroom/youtube-reinforcement-learning-based-speech-enhancement-for-robust-speech-recognition-131999)
	- [Sound classification with YAMNet TensorFlow Hub](https://www.tensorflow.org/hub/tutorials/yamnet)
	- [DEEP-VOICE: DeepFake Voice Recognition Dataset \| Papers With Code](https://paperswithcode.com/dataset/deep-voice-deepfake-voice-recognition)

	---

	## Dataset

	- [VoiceBank+DEMAND (Edinburgh DataShare)](https://datashare.ed.ac.uk/handle/10283/1942)
	- [VoiceBank+DEMAND (Google Drive)](https://drive.google.com/drive/folders/19I_thf6F396y5gZxLTxYIojZXC0Ywm8l)

	---