Beyond Chains of Thought: Benchmarking Latent-Space Reasoning Abilities in Large Language Models Paper • 2504.10615 • Published 24 days ago • 1
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7 • 140
— UI is a good thing 💅 — Collection Cool spaces with a cool UI. What could be better? • 5 items • Updated 3 days ago • 17
Model Hubs and Beyond: Analyzing Model Popularity, Performance, and Documentation Paper • 2503.15222 • Published Mar 19 • 1
The AI Community Building the Future? A Quantitative Analysis of Development Activity on Hugging Face Hub Paper • 2405.13058 • Published May 20, 2024 • 2
SpaceByte: Towards Deleting Tokenization from Large Language Modeling Paper • 2404.14408 • Published Apr 22, 2024 • 7
T-FREE: Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings Paper • 2406.19223 • Published Jun 27, 2024 • 11
Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information Paper • 2502.14258 • Published Feb 20 • 26
Foundation Text-Generation Models Below 360M Parameters Collection Great candidates for fine-tuning targeting Wllama and Transformers.js for mobile devices, ordered by number of parameters. • 36 items • Updated Apr 6 • 31
OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models Paper • 2503.08686 • Published Mar 11 • 19