- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 148
- Orion-14B: Open-source Multilingual Large Language Models
  Paper • 2401.12246 • Published • 13
- MambaByte: Token-free Selective State Space Model
  Paper • 2401.13660 • Published • 59
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 49

Collections
Collections including paper arxiv:2410.24198
- OpenCodeInstruct: A Large-scale Instruction Tuning Dataset for Code LLMs
  Paper • 2504.04030 • Published
- KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding
  Paper • 2503.02951 • Published • 30
- BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
  Paper • 2406.15877 • Published • 47
- Magicoder: Source Code Is All You Need
  Paper • 2312.02120 • Published • 82

- Video Creation by Demonstration
  Paper • 2412.09551 • Published • 9
- DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation
  Paper • 2412.07589 • Published • 49
- Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation
  Paper • 2412.06531 • Published • 73
- APOLLO: SGD-like Memory, AdamW-level Performance
  Paper • 2412.05270 • Published • 39

- PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment
  Paper • 2410.13785 • Published • 19
- Aligning Large Language Models via Self-Steering Optimization
  Paper • 2410.17131 • Published • 23
- Baichuan Alignment Technical Report
  Paper • 2410.14940 • Published • 52
- SemiEvol: Semi-supervised Fine-tuning for LLM Adaptation
  Paper • 2410.14745 • Published • 48

- glaiveai/glaive-coder-7b
  Text Generation • Updated • 10 • 54
- glaiveai/glaive-code-assistant-v3
  Viewer • Updated • 950k • 263 • 52
- ibm-granite/granite-3b-code-base-128k
  Text Generation • Updated • 114 • 6
- Granite Code Models: A Family of Open Foundation Models for Code Intelligence
  Paper • 2405.04324 • Published • 22

- LLMs + Persona-Plug = Personalized LLMs
  Paper • 2409.11901 • Published • 34
- To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
  Paper • 2409.12183 • Published • 39
- Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
  Paper • 2402.12875 • Published • 13
- TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices
  Paper • 2410.00531 • Published • 33

- LinFusion: 1 GPU, 1 Minute, 16K Image
  Paper • 2409.02097 • Published • 35
- Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
  Paper • 2409.11406 • Published • 28
- Diffusion Models Are Real-Time Game Engines
  Paper • 2408.14837 • Published • 126
- Segment Anything with Multiple Modalities
  Paper • 2408.09085 • Published • 23