AudioLLM - a zerozeyi Collection

updated Jul 29, 2024

GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

Paper • 2406.11768 • Published Jun 17, 2024 • 20

Investigating Decoder-only Large Language Models for Speech-to-text Translation

Paper • 2407.03169 • Published Jul 3, 2024 • 11

PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation

Paper • 2407.02869 • Published Jul 3, 2024 • 21

FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs

Paper • 2407.04051 • Published Jul 4, 2024 • 40

Stable Audio Open

Paper • 2407.14358 • Published Jul 19, 2024 • 27