Respair committed on
Commit 797f680 · verified · 1 Parent(s): f7c483f

Update demo.py

Files changed (1)
  1. demo.py +4 -2
demo.py CHANGED
@@ -317,11 +317,13 @@ with gr.Blocks() as longform:
  outputs=[audio_longform],
  concurrency_limit=4)
 
- # --- User Guide / Info Tab (Reformatted User Text) ---
- # Convert Markdown-like text to basic HTML for styling
  user_guide_html = f"""
  <div style="background-color: rgba(30, 30, 30, 0.9); color: #f0f0f0; padding: 20px; border-radius: 10px; border: 1px solid #444;">
  <h2 style="border-bottom: 1px solid #555; padding-bottom: 5px;">Quick Notes:</h2>
+
+ <p> This is run on a single RTX 3090. </p>
+ <p> These networks can only generate natural speech with correct intonations (i.e generating NSFW, non-speech sounds, stutters etc. doesn't work.) </p>
+ <p> I will gradually update here and -> [Github](https://github.com/Respaired/Project_Kalliope) </p>
  <p>Everything in this demo & the repo (coming soon) is experimental. The main idea is just playing around with different things to see what works when you're limited to training on a pair of RTX 3090s.</p>
  <p>The data used for the english model is rough and pretty tough for any TTS model (think debates, real conversations, plus a little bit of cleaner professional performances). It mostly comes from public sources or third parties (no TOS signed). I'll probably write a blog post later with more details.</p>
  <p>So far I focused on English and Russian, more can be covered.</p>