# Byte Latent Transformer (BLT)
This repository contains the model weights for Meta's Byte Latent Transformer (BLT) model.
## Model Structure
This repository contains:
- `blt_1b/`: BLT-1B model weights in PyTorch format
- `blt_7b/`: BLT-7B model weights in PyTorch format
- `entropy_model/`: Entropy model weights for dynamic patching in PyTorch format
- `safetensors/`: All model weights converted to SafeTensors format
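The PyTorch-format weights can be loaded directly with `torch.load`. A minimal sketch, assuming each directory holds a single consolidated checkpoint whose filename mirrors the SafeTensors copies (the path below is an assumption; check the directory contents for the actual filename):

```python
import torch

# Hypothetical path: assumes the PyTorch checkpoint follows the
# `consolidated.*` naming used by the SafeTensors copies.
state_dict = torch.load('blt_1b/consolidated.pth', map_location='cpu')
```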
## SafeTensors Format
This repository includes model weights in SafeTensors format, which offers:
- Faster loading times
- Better memory efficiency
- Improved security
### Loading SafeTensors
```python
from safetensors.torch import load_file

# Load BLT-1B model weights
model_weights = load_file('safetensors/blt_1b/consolidated.safetensors')

# Load entropy model weights
entropy_weights = load_file('safetensors/entropy_model/consolidated.safetensors')
```
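`load_file` returns a plain dict mapping parameter names to tensors, so the checkpoints can be inspected before building the model. A quick sanity check, nothing BLT-specific:

```python
# Print the first few parameter names, shapes, and dtypes
for name, tensor in list(model_weights.items())[:5]:
    print(f'{name}: {tuple(tensor.shape)} {tensor.dtype}')
```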
## Abstract

We introduce the Byte Latent Transformer (BLT), a new byte-level LLM architecture that, for the first time, matches tokenization-based LLM performance at scale, with significant improvements in inference efficiency and robustness. BLT encodes bytes into dynamically sized patches, which serve as the primary units of computation. Patches are segmented dynamically based on the entropy of the next byte, allocating more compute and model capacity where there is more data complexity. The BLT architecture includes new attention mechanisms to maximize the information flow between byte and patch hidden representations and a new type of byte-sequence memory.
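The patching rule the abstract describes can be sketched in a few lines: run a small byte-level model, compute the Shannon entropy of its next-byte distribution at each position, and open a new patch wherever that entropy crosses a threshold. The snippet below is a toy illustration under that global-threshold reading; random logits stand in for the entropy model's predictions, and `entropy_patch_boundaries` is a hypothetical helper, not Meta's implementation:

```python
import torch

def entropy_patch_boundaries(entropies: torch.Tensor, threshold: float) -> list[int]:
    """Start a new patch at position i when the next-byte entropy
    exceeds a global threshold (illustrative only)."""
    boundaries = [0]  # a patch always starts at the first byte
    for i in range(1, entropies.numel()):
        if entropies[i].item() > threshold:
            boundaries.append(i)
    return boundaries

# Toy next-byte distributions over the 256 possible byte values,
# standing in for the entropy model's real predictions.
logits = torch.randn(16, 256)
log_probs = torch.log_softmax(logits, dim=-1)
entropies = -(log_probs.exp() * log_probs).sum(dim=-1)  # H per position

print(entropy_patch_boundaries(entropies, threshold=5.0))
```

High-entropy positions are exactly where the next byte is hard to predict, so this rule spends more patches, and therefore more latent-transformer compute, on complex spans of the input.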