ViP-LLaVA - a llava-hf Collection

ViP-LLaVA

updated Mar 25, 2024

ViP-LLaVA is a novel approach that allows large multimodal models to understand arbitrary visual prompts.

Making Large Multimodal Models Understand Arbitrary Visual Prompts

Paper • 2312.00784 • Published Dec 1, 2023

llava-hf/vip-llava-7b-hf

Image-Text-to-Text • Updated Jan 27 • 12.8k • 17
llava-hf/vip-llava-13b-hf

Image-Text-to-Text • Updated Jan 27 • 331 • 11
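
The checkpoints listed above can be tried with the `transformers` library. The following is a minimal sketch, not an official recipe: it assumes `transformers`, `torch`, and `Pillow` are installed, that the `VipLlavaForConditionalGeneration` class is used for these checkpoints, and that the prompt template below matches the one on the model card (verify it there before relying on it). The helper names `build_prompt` and `describe_image` are hypothetical and exist only for illustration.

```python
MODEL_ID = "llava-hf/vip-llava-7b-hf"  # checkpoint name taken from the collection list

# Assumed prompt template; check the model card before relying on it.
PROMPT_TEMPLATE = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's "
    "questions.###Human: <image>\n{question}###Assistant:"
)


def build_prompt(question: str) -> str:
    """Insert the user's question into the assumed chat template."""
    return PROMPT_TEMPLATE.format(question=question)


def describe_image(image_path: str, question: str, max_new_tokens: int = 50) -> str:
    """Load the checkpoint and answer a question about one local image.

    Heavy imports are kept inside the function so that build_prompt can be
    used without the model dependencies installed.
    """
    import torch
    from PIL import Image
    from transformers import AutoProcessor, VipLlavaForConditionalGeneration

    # Use half precision on a GPU, full precision on CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = VipLlavaForConditionalGeneration.from_pretrained(
        MODEL_ID, torch_dtype=dtype
    ).to(device)

    # The image may already carry a visual prompt (for example a drawn box or arrow).
    image = Image.open(image_path).convert("RGB")
    inputs = processor(
        text=build_prompt(question), images=image, return_tensors="pt"
    ).to(device, dtype)

    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

Calling `describe_image("photo.png", "What is inside the red box?")` downloads the 7B weights on first use, so a GPU with enough memory is advisable; swapping `MODEL_ID` for `llava-hf/vip-llava-13b-hf` should select the larger checkpoint.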