abotresol's Collections

basic-blocs

updated 11 days ago

  • Tensor Product Attention Is All You Need

    Paper • 2501.06425 • Published Jan 11 • 88

  • TransMLA: Multi-head Latent Attention Is All You Need

    Paper • 2502.07864 • Published Feb 11 • 50

  • Union of Experts: Adapting Hierarchical Routing to Equivalently Decomposed Transformer

    Paper • 2503.02495 • Published Mar 4 • 8

  • BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs

    Paper • 2504.18415 • Published 13 days ago • 41