# Byte Latent Transformer (BLT)
This repository contains the model weights for Meta's Byte Latent Transformer (BLT) model.
## Model Structure
This repository contains:
- `blt_1b/`: BLT-1B model weights in PyTorch format
- `blt_7b/`: BLT-7B model weights in PyTorch format
- `entropy_model/`: Entropy model weights for dynamic patching in PyTorch format
- `safetensors/`: All model weights converted to SafeTensors format
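The PyTorch-format weights can be loaded directly with `torch.load`. A minimal sketch, assuming each directory holds a single consolidated checkpoint whose filename mirrors the SafeTensors copies (the path below is an assumption; check the directory contents for the actual filename):

```python
import torch

# Hypothetical path: assumes the PyTorch checkpoint follows the
# `consolidated.*` naming used by the SafeTensors copies.
state_dict = torch.load('blt_1b/consolidated.pth', map_location='cpu')
```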
## SafeTensors Format
This repository includes model weights in SafeTensors format, which offers:
- Faster loading times
- Better memory efficiency
- Improved security
### Loading SafeTensors
```python
from safetensors.torch import load_file

# Load BLT-1B model weights
model_weights = load_file('safetensors/blt_1b/consolidated.safetensors')

# Load entropy model weights
entropy_weights = load_file('safetensors/entropy_model/consolidated.safetensors')
```
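`load_file` returns a plain dict mapping parameter names to tensors, so the checkpoints can be inspected before building the model. A quick sanity check, nothing BLT-specific:

```python
# Print the first few parameter names, shapes, and dtypes
for name, tensor in list(model_weights.items())[:5]:
    print(f'{name}: {tuple(tensor.shape)} {tensor.dtype}')
```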
## Abstract

We introduce the Byte Latent Transformer (BLT), a new byte-level LLM architecture that, for the first time, matches tokenization-based LLM performance at scale, with significant improvements in inference efficiency and robustness. BLT encodes bytes into dynamically sized patches, which serve as the primary units of computation. Patches are segmented dynamically based on the entropy of the next byte, allocating more compute and model capacity where there is more data complexity. The BLT architecture includes new attention mechanisms to maximize the information flow between byte and patch hidden representations and a new type of byte-sequence memory.
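The patching rule the abstract describes can be sketched in a few lines: run a small byte-level model, compute the Shannon entropy of its next-byte distribution at each position, and open a new patch wherever that entropy crosses a threshold. The snippet below is a toy illustration under that global-threshold reading; random logits stand in for the entropy model's predictions, and `entropy_patch_boundaries` is a hypothetical helper, not Meta's implementation:

```python
import torch

def entropy_patch_boundaries(entropies: torch.Tensor, threshold: float) -> list[int]:
    """Start a new patch at position i when the next-byte entropy
    exceeds a global threshold (illustrative only)."""
    boundaries = [0]  # a patch always starts at the first byte
    for i in range(1, entropies.numel()):
        if entropies[i].item() > threshold:
            boundaries.append(i)
    return boundaries

# Toy next-byte distributions over the 256 possible byte values,
# standing in for the entropy model's real predictions.
logits = torch.randn(16, 256)
log_probs = torch.log_softmax(logits, dim=-1)
entropies = -(log_probs.exp() * log_probs).sum(dim=-1)  # H per position

print(entropy_patch_boundaries(entropies, threshold=5.0))
```

High-entropy positions are exactly where the next byte is hard to predict, so this rule spends more patches, and therefore more latent-transformer compute, on complex spans of the input.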