- EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
  Paper • 2402.04252 • Published • 28
- Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
  Paper • 2402.03749 • Published • 13
- ScreenAI: A Vision-Language Model for UI and Infographics Understanding
  Paper • 2402.04615 • Published • 44
- EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
  Paper • 2402.05008 • Published • 23
Collections
Collections including paper arxiv:2504.16929
- I-Con: A Unifying Framework for Representation Learning
  Paper • 2504.16929 • Published • 29
- LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities
  Paper • 2504.16078 • Published • 20
- WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
  Paper • 2504.15785 • Published • 19
- OTC: Optimal Tool Calls via Reinforcement Learning
  Paper • 2504.14870 • Published • 33

- CoRAG: Collaborative Retrieval-Augmented Generation
  Paper • 2504.01883 • Published • 10
- VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
  Paper • 2504.08837 • Published • 42
- Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
  Paper • 2504.10068 • Published • 30
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
  Paper • 2504.10481 • Published • 84

- Forgetting Transformer: Softmax Attention with a Forget Gate
  Paper • 2503.02130 • Published • 32
- L^2M: Mutual Information Scaling Law for Long-Context Language Modeling
  Paper • 2503.04725 • Published • 20
- Transformers without Normalization
  Paper • 2503.10622 • Published • 162
- I-Con: A Unifying Framework for Representation Learning
  Paper • 2504.16929 • Published • 29