Compilade

compilade

Recent Activity

Organizations

ngxson and friends

compilade's activity

New activity in microsoft/bitnet-b1.58-2B-4T-gguf 17 days ago

TQ1 quant version

3
#7 opened 17 days ago by
TobDeBer
New activity in mradermacher/BabyHercules-4x150M-GGUF about 1 month ago
replied to bartowski's post 7 months ago

KLD measures the difference between 2 probability distributions, typically between a "ground truth" and a model prediction.

Yes, and from my understanding ln(PPL(Q)/PPL(base)) measures the difference between the probabilities assigned to the "correct" tokens according to the test dataset (at least for the second half of each chunk, same as for KLD). This means it would be possible to keep perplexity the same or better while also increasing KLD, by giving the non-"correct" tokens different probabilities.
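
As a toy illustration of that point, here is a minimal sketch with made-up 3-token distributions (not output from any real model, and the helper names are hypothetical). The quantized distributions keep the probability of the "correct" token unchanged and only swap the probabilities of the other tokens, so the log perplexity ratio stays at zero while the mean KLD is clearly positive.

```python
import math

def perplexity(dists, targets):
    """exp of the mean negative log-probability of the "correct" tokens."""
    nll = -sum(math.log(d[t]) for d, t in zip(dists, targets)) / len(targets)
    return math.exp(nll)

def mean_kld(base, quant):
    """Mean KL(base || quant) over all positions."""
    total = 0.0
    for p, q in zip(base, quant):
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return total / len(base)

# Made-up next-token distributions over a 3-token vocabulary, 2 positions.
base = [[0.5, 0.4, 0.1], [0.7, 0.2, 0.1]]
# Same probability for token 0, but the other two tokens are swapped.
quant = [[0.5, 0.1, 0.4], [0.7, 0.1, 0.2]]
targets = [0, 0]  # the "correct" token at each position

log_ppl_ratio = math.log(perplexity(quant, targets) / perplexity(base, targets))
kld = mean_kld(base, quant)

print(f"ln(PPL(Q)/PPL(base)) = {log_ppl_ratio:.4f}")
print(f"mean KLD             = {kld:.4f}")
```

Perplexity only looks at one entry per position, while KLD compares the whole distribution, which is why the two can disagree here.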

This makes me wonder: do all of the token probabilities have to match closely for a quantized model to still be good?

I guess it depends on whether the goal is to make a faithful quantization, or an equally good model through quantization-aware fine-tuning.
The way imatrix works, it can't really "fine-tune" a model towards a lower perplexity; it can only prioritize error reduction when quantizing the weights in the columns with more impact on the activations. So I would say that faithfulness to the full-precision model is the goal of the quantization in this case, and thus KLD feels more appropriate.
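
To make "prioritize error reduction in the columns with more impact" a bit more concrete, here is a toy sketch of importance-weighted rounding. It is not the actual llama.cpp code: the function names, the ternary grid, the brute-force scale search, and the importance values are all assumptions made up for the example. The idea is only that the scale is chosen to minimize a squared error weighted per column, so columns with large importance get rounded more accurately at the expense of the others.

```python
def quantize_row(x, scale, nmax=1):
    """Round each weight to an integer in [-nmax, nmax] (ternary when nmax=1)."""
    return [max(-nmax, min(nmax, round(v / scale))) for v in x]

def weighted_error(x, q, scale, w):
    """Squared reconstruction error, weighted per column."""
    return sum(wi * (xi - scale * qi) ** 2 for xi, qi, wi in zip(x, q, w))

def best_scale(x, w, nmax=1, steps=200):
    """Brute-force search for the scale with the lowest weighted error."""
    amax = max(abs(v) for v in x)
    best = None
    for i in range(1, steps + 1):
        scale = amax * i / steps / nmax
        err = weighted_error(x, quantize_row(x, scale, nmax), scale, w)
        if best is None or err < best[0]:
            best = (err, scale)
    return best[1]

# Hypothetical row of weights and per-column importance
# (standing in for something like mean squared activations).
x = [0.9, -0.3, 0.25, 0.1]
importance = [0.01, 4.0, 4.0, 0.01]
uniform = [1.0] * len(x)

s_plain = best_scale(x, uniform)     # ignores importance
s_imat = best_scale(x, importance)   # uses importance

err_plain = weighted_error(x, quantize_row(x, s_plain), s_plain, importance)
err_imat = weighted_error(x, quantize_row(x, s_imat), s_imat, importance)

print(f"scale without importance: {s_plain:.4f}, weighted error {err_plain:.4f}")
print(f"scale with importance:    {s_imat:.4f}, weighted error {err_imat:.4f}")
```

Nothing here pushes the model towards a lower perplexity; the search only decides where the unavoidable rounding error is allowed to land.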

Of course, I might be wrong; I don't really have a full understanding of the statistics going on in perplexity and KL-divergence calculations.

However, for quantization-aware fine-tuning, ln(PPL(Q)/PPL(base)) is likely a better indicator of quantization quality than KLD, unless the goal of the fine-tuning was actually to minimize KLD.

New activity in HF1BitLLM/Llama3-8B-1.58-100B-tokens 8 months ago

GGUF conversion

4
11
#3 opened 8 months ago by
compilade
New activity in mistralai/Mamba-Codestral-7B-v0.1 10 months ago

Update hardcoded filenames

5
1
#1 opened 10 months ago by
Wauplin
New activity in ai21labs/Jamba-v0.1 about 1 year ago
New activity in jondurbin/bagel-dpo-2.8b-v0.2 about 1 year ago

GGUF Please

3
#1 opened over 1 year ago by
HR1777
New activity in clibrain/mamba-2.8b-instruct-openhermes about 1 year ago

gguf

6
1
#1 opened over 1 year ago by
LaferriereJC
New activity in pansophic/rocket-3B over 1 year ago