aimagelab/LLaVA_MORE-llama_3_1-8B-S2-siglip-finetuning Image-Text-to-Text • Updated 14 days ago • 24 • 2
aimagelab/LLaVA_MORE-llama_3_1-8B-S2-siglip-pretrain Image-Text-to-Text • Updated 14 days ago • 5
aimagelab/LLaVA_MORE-llama_3_1-8B-siglip-finetuning Image-Text-to-Text • Updated 14 days ago • 38 • 1
aimagelab/LLaVA_MORE-llama_3_1-8B-finetuning Image-Text-to-Text • Updated 14 days ago • 225 • 9
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published 27 days ago • 123
Running • 2.56k • The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
ReflectiVA Collection Model and data for ReflectiVA: Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering [CVPR 2025] • 2 items • Updated 26 days ago
ReflectiVA Collection Models and data for ReflectiVA: Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering [CVPR 2025] • 3 items • Updated Apr 5