LLaMA-VID - a YanweiLi Collection

updated Dec 3, 2023

LLaMA-VID checkpoints. Please refer to the project page for more details: https://llama-vid.github.io/

YanweiLi/llama-vid-7b-pretrain-224

Text Generation • Updated Dec 3, 2023 • 6

YanweiLi/llama-vid-7b-pretrain-224-video-fps-1

Text Generation • Updated Dec 3, 2023 • 30 • 2

YanweiLi/llama-vid-7b-pretrain-336

Text Generation • Updated Dec 3, 2023 • 16

YanweiLi/llama-vid-13b-pretrain-336

Text Generation • Updated Dec 3, 2023 • 5

YanweiLi/llama-vid-13b-pretrain-224-video-fps-1

Text Generation • Updated Dec 3, 2023 • 5

YanweiLi/llama-vid-7b-full-336

Text Generation • Updated Dec 2, 2023 • 18

YanweiLi/llama-vid-7b-full-224-video-fps-1

Text Generation • Updated Dec 3, 2023 • 87 • 9

YanweiLi/llama-vid-7b-full-224

Text Generation • Updated Dec 3, 2023 • 12 • 1

YanweiLi/llama-vid-13b-full-224-video-fps-1

Text Generation • Updated Dec 3, 2023 • 10 • 2

YanweiLi/llama-vid-13b-full-336

Text Generation • Updated Dec 2, 2023 • 8

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models

Paper • 2311.17043 • Published Nov 28, 2023
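
The repository names listed in this collection appear to share a common pattern: a size tag (7b or 13b), a stage tag (pretrain or full), a number (224 or 336), and an optional "-video-fps-1" suffix. The sketch below splits a repository ID into those parts, which can help when selecting a checkpoint programmatically. The field names and their readings (model size, training stage, input resolution, video frame rate) are assumptions inferred from the names alone, not definitions taken from the project; `Checkpoint` and `parse_checkpoint` are hypothetical helpers written for this illustration.

```python
import re
from typing import NamedTuple, Optional


class Checkpoint(NamedTuple):
    # Field meanings are assumptions inferred from the repository names.
    size: str                 # e.g. "7b" or "13b"
    stage: str                # "pretrain" or "full"
    resolution: int           # e.g. 224 or 336
    video_fps: Optional[int]  # 1 when the "-video-fps-1" suffix is present, else None


# Pattern derived from the names listed in this collection only.
_PATTERN = re.compile(
    r"^YanweiLi/llama-vid-(?P<size>\d+b)-(?P<stage>pretrain|full)"
    r"-(?P<res>\d+)(?:-video-fps-(?P<fps>\d+))?$"
)


def parse_checkpoint(repo_id: str) -> Checkpoint:
    """Split a repository ID from this collection into its name components."""
    match = _PATTERN.match(repo_id)
    if match is None:
        raise ValueError(f"unrecognized repository ID: {repo_id!r}")
    fps = match.group("fps")
    return Checkpoint(
        size=match.group("size"),
        stage=match.group("stage"),
        resolution=int(match.group("res")),
        video_fps=int(fps) if fps is not None else None,
    )


if __name__ == "__main__":
    print(parse_checkpoint("YanweiLi/llama-vid-7b-full-224-video-fps-1"))
    print(parse_checkpoint("YanweiLi/llama-vid-13b-pretrain-336"))
```

A repository ID that does not match the pattern raises `ValueError`, so names outside this collection are rejected rather than silently misread.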