9-Volt Fan

9voltfan2009

AI & ML interests

None yet

9voltfan2009's activity

liked a Space about 18 hours ago

140

ACE Step

😻

A Step Towards Music Generation Foundation Model

reacted to AdinaY's post with 😎 about 18 hours ago

Post

3316

ACE-Step 🎵 a music generation foundation model released by
StepFun & ACEStudio

Model: ACE-Step/ACE-Step-v1-3.5B
Demo: ACE-Step/ACE-Step

✨ 3.5B parameters, Apache 2.0 licensed
✨ 115× faster than LLMs (4-min music in 20s on A100)
✨ Diffusion + DCAE + linear transformer = speed + coherence
✨ Supports voice cloning, remixing, lyric editing & more

1 reply

liked 4 Spaces about 18 hours ago

647

StyleTTS 2

🗣

Efficient, fast, and natural text to speech with StyleTTS 2!

StyleTTS2 Lite

🦆

Generate audio from text with customizable voice

Audio Super Resolution

🎧

Enhance audio quality with AudioSR

223

ClearerVoice-Studio (Speech Enhancement, Separation and Extraction)

📈

A better AI-powered platform to purify your speech signal

reacted to prithivMLmods's post with 👍 about 19 hours ago

Post

1951

Well, here’s the updated version with the 20,000+ entry sampled dataset for Watermark Filter Content Moderation models, incl. [Food25, Weather, Watermark, Marathi/Hindi Sign Language Detection], post-trained from the base models: SigLIP2 patch16 224. It now uses mixed aspect ratios for better performance and reduced misclassification. 🔥

Models :
➮ Watermark-Detection : prithivMLmods/Watermark-Detection-SigLIP2
⌨︎ Watermark Detection & Batch Image Processing Experimentals, Colab Notebook : https://colab.research.google.com/drive/1mlQrSsSjkGimUt0VyRi3SoWMv8OMyvw3?usp=drive_link
➮ Weather-Image-Classification : prithivMLmods/Weather-Image-Classification
➮ TurkishFoods-25 : prithivMLmods/TurkishFoods-25
➮ Marathi-Sign-Language-Detection : prithivMLmods/Marathi-Sign-Language-Detection
➮ Hindi-Sign-Language-Detection : prithivMLmods/Hindi-Sign-Language-Detection

Datasets :
Watermark : qwertyforce/scenery_watermarks
Weather : prithivMLmods/WeatherNet-05-18039
Turkish Foods 25 : yunusserhat/TurkishFoods-25
Marathi Sign Language : VinayHajare/Marathi-Sign-Language
Hindi Sign Language : Vedant3907/Hindi-Sign-Language-Dataset

Collection : prithivMLmods/content-filters-siglip2-vit-68197e3357d4de18fb3b4d2b

reacted to nyuuzyou's post with ❤️🔥 about 19 hours ago

Post

3201

nyuuzyou/svgfind 👀

Well, everything happens for the first time 🤗. Thank you all!

liked a Space about 20 hours ago

358

VALL E X

🎙

Generate audio from text using voice prompts

liked 2 Spaces 3 days ago

477

Qwen3 Demo

📊

Generate responses to your messages

WeShopAI Bad Hand Fixer

🖐

Enhance hand features for flawless, natural-looking images

updated a model 4 days ago

9voltfan2009/DorkDiaries-RVC

Updated 4 days ago

reacted to fdaudens's post with 👍🔥 5 days ago

Post

2914

Forget everything you know about transcription models - NVIDIA's parakeet-tdt-0.6b-v2 changed the game for me!

Just tested it with Steve Jobs' Stanford speech and was speechless (pun intended). The video isn’t sped up.

3 things that floored me:
- Transcription took just 10 seconds for a 15-min file
- Got a CSV with perfect timestamps, punctuation & capitalization
- Stunning accuracy (correctly captured "Reed College" and other specifics)

NVIDIA also released a demo where you can click any transcribed segment to play it instantly.

The improvement is significant: number 1 on the ASR Leaderboard, with a 6% error rate (best in class) and complete commercial freedom (CC-BY-4.0 license).

Time to update those Whisper pipelines! H/t @Steveeeeeeen for the finding!

Model: nvidia/parakeet-tdt-0.6b-v2
Demo: nvidia/parakeet-tdt-0.6b-v2
ASR Leaderboard: hf-audio/open_asr_leaderboard

1 reply

liked a Space 6 days ago

Karaoke Chaos

🎤

Separate and transcribe duet audio into individual voices

reacted to DevinGrey's post with 👀 6 days ago

Post

1481

Hello all. I am new to all of this and just beginning to learn how to use Hugging Face and AI in general. How can I access an AI code developer for help in setting up a website?

3 replies

updated a collection 6 days ago

RVC Voice Models

Collection

2 items • Updated 6 days ago

reacted to lukmanaj's post with 😎 6 days ago

Post

2162

I’m excited to share that I’ve completed the Hugging Face Agents Course and earned my certificate.

Over the past few months, I explored how to build intelligent, autonomous agents using cutting-edge tools like smolagents, LlamaIndex, and LangGraph. The course covered everything from the fundamentals of agents to advanced topics like fine-tuning for function-calling, observability, evaluation, and even agents in games.

Some key content included:

1. Introduction to AI Agents

2. Agentic RAG use cases

3. Multi-framework implementation: smolagents, LlamaIndex, and LangGraph

4. Building, testing, and certifying a complete agent project

This was a hands-on, practical experience that deepened my understanding of how to design reliable, tool-using LLM agents. Looking forward to leveraging these skills in real-world applications in healthcare, logistics, and beyond.

Many thanks to the Hugging Face team for putting this together.
Let’s build safe and useful agents!

7 replies