OWLS: Scaling Laws for Speech Recognition and Translation Collection 🦉 A suite of Whisper-style models from 250M to 18B parameters, trained on up to 360K hours of data at a 16 kHz sampling rate. • 8 items • Updated 6 days ago
video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model Paper • 2502.11775 • Published Feb 17