Spaces: ftshijt/versa

ftshijt committed 18 days ago

Commit 37f33b1 (verified) · 1 Parent(s): 3e8c109

Update app.py

Files changed (1)

app.py +17 -9


@@ -367,7 +367,7 @@ def evaluate_audio(gt_file, pred_file, metric_config, include_timestamps=False):
                 # Format results as DataFrame
                 if results:
                     results_df = pd.DataFrame([results])
-                    return results_df, json.dumps(results, indent=2)
+                    return results_df.T, json.dumps(results, indent=2)
                 else:
                     return None, "Evaluation completed but no results were generated."
             else:
@@ -501,7 +501,7 @@ def create_gradio_demo():
                     VERSA is a toolkit dedicated to collecting evaluation metrics in speech and audio quality.
                     It provides a comprehensive connection to cutting-edge evaluation techniques and is tightly integrated with ESPnet.
-                    With full installation, VERSA offers over 60 metrics with 700+ metric variations based on different configurations.
+                    With full installation, VERSA offers over 80 metrics with 700+ metric variations based on different configurations.
                     These metrics encompass evaluations utilizing diverse external resources, including matching and non-matching
                     reference audio, text transcriptions, and text captions.
@@ -516,14 +516,22 @@ def create_gradio_demo():
                     ### Citation
                     ```
-                    @misc{shi2024versaversatileevaluationtoolkit,
-                      title={VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music},
-                      author={Jiatong Shi and Hye-jin Shim and Jinchuan Tian and Siddhant Arora and Haibin Wu and Darius Petermann and Jia Qi Yip and You Zhang and Yuxun Tang and Wangyou Zhang and Dareen Safar Alharthi and Yichen Huang and Koichi Saito and Jionghao Han and Yiwen Zhao and Chris Donahue and Shinji Watanabe},
+                    @inproceedings{shi2025versa,
+                    title={{VERSA}: A Versatile Evaluation Toolkit for Speech, Audio, and Music},
+                    author={Jiatong Shi and Hye-jin Shim and Jinchuan Tian and Siddhant Arora and Haibin Wu and Darius Petermann and Jia Qi Yip and You Zhang and Yuxun Tang and Wangyou Zhang and Dareen Safar Alharthi and Yichen Huang and Koichi Saito and Jionghao Han and Yiwen Zhao and Chris Donahue and Shinji Watanabe},
+                    booktitle={2025 Annual Conference of the North American Chapter of the Association for Computational Linguistics -- System Demonstration Track},
+                    year={2025},
+                    url={https://openreview.net/forum?id=zU0hmbnyQm}
+                    }
+                    @inproceedings{shi2024versaversatileevaluationtoolkit,
+                      author={Shi, Jiatong and Tian, Jinchuan and Wu, Yihan and Jung, Jee-Weon and Yip, Jia Qi and Masuyama, Yoshiki and Chen, William and Wu, Yuning and Tang, Yuxun and Baali, Massa and Alharthi, Dareen and Zhang, Dong and Deng, Ruifan and Srivastava, Tejes and Wu, Haibin and Liu, Alexander and Raj, Bhiksha and Jin, Qin and Song, Ruihua and Watanabe, Shinji},
+                      booktitle={2024 IEEE Spoken Language Technology Workshop (SLT)},
+                      title={ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs For Audio, Music, and Speech},
                       year={2024},
-                      eprint={2412.17667},
-                      archivePrefix={arXiv},
-                      primaryClass={cs.SD},
-                      url={https://arxiv.org/abs/2412.17667},
+                      pages={562-569},
+                      keywords={Training;Measurement;Codecs;Speech coding;Conferences;Focusing;Neural codecs;codec evaluation},
+                      doi={10.1109/SLT61566.2024.10832289}
                     }
                     ```
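
The first hunk changes the return value of `evaluate_audio` from `results_df` to `results_df.T`. Since `pd.DataFrame([results])` builds a single-row table with one column per metric, transposing it presumably makes the output easier to read when many metrics are enabled: one row per metric instead of one very wide row. A minimal sketch of the effect, assuming `results` is a flat dict of scores (the metric names and values below are hypothetical placeholders, not actual VERSA output):

```python
import json

import pandas as pd

# Hypothetical scores; real keys depend on the selected metric configuration.
results = {"metric_a": 3.21, "metric_b": 0.94, "metric_c": 12.5}

# Before the commit: one row, one column per metric (wide layout).
wide = pd.DataFrame([results])

# After the commit: one row per metric, a single column labelled 0 (tall layout).
tall = wide.T

print(wide.shape)
print(tall.shape)
print(tall)

# The JSON string returned alongside the table is unchanged by the commit.
print(json.dumps(results, indent=2))
```

The transposed frame keeps the metric names as its index, so whether they appear in the rendered table depends on how the receiving UI component treats the index.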