Collections
Collections including paper arxiv:2503.19786

Collection:
- Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs (Paper • 2503.16870 • Published • 5)
- Gemma 3 Technical Report (Paper • 2503.19786 • Published • 50)
- Qwen2.5-Omni Technical Report (Paper • 2503.20215 • Published • 150)
- Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking (Paper • 2503.19855 • Published • 27)

Collection:
- Gemma 3 Technical Report (Paper • 2503.19786 • Published • 50)
- Kimi-VL Technical Report (Paper • 2504.07491 • Published • 125)
- InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models (Paper • 2504.10479 • Published • 255)
- FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding (Paper • 2504.09925 • Published • 38)

Collection:
- Reinforcement Learning: An Overview (Paper • 2412.05265 • Published • 7)
- Fish-Speech: Leveraging Large Language Models for Advanced Multilingual Text-to-Speech Synthesis (Paper • 2411.01156 • Published • 6)
- VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness (Paper • 2503.21755 • Published • 34)
- Qwen2.5-Omni Technical Report (Paper • 2503.20215 • Published • 150)

Collection:
- MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models (Paper • 2501.02955 • Published • 45)
- 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining (Paper • 2501.00958 • Published • 107)
- MMVU: Measuring Expert-Level Multi-Discipline Video Understanding (Paper • 2501.12380 • Published • 85)
- VideoWorld: Exploring Knowledge Learning from Unlabeled Videos (Paper • 2501.09781 • Published • 29)

Collection:
- Phi-4 Technical Report (Paper • 2412.08905 • Published • 116)
- Evaluating and Aligning CodeLLMs on Human Preference (Paper • 2412.05210 • Published • 51)
- Evaluating Language Models as Synthetic Data Generators (Paper • 2412.03679 • Published • 49)
- Yi-Lightning Technical Report (Paper • 2412.01253 • Published • 29)

Collection:
- Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models (Paper • 2404.13013 • Published • 32)
- Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing (Paper • 2404.12253 • Published • 56)
- Data-Efficient Contrastive Language-Image Pretraining: Prioritizing Data Quality over Quantity (Paper • 2403.12267 • Published)
- No More Adam: Learning Rate Scaling at Initialization is All You Need (Paper • 2412.11768 • Published • 44)