As mentioned, we’ve open-sourced our benchmarking code here: https://github.com/keyboardAnt/hf-bench
Nadav Timor
Nadav-Timor
Recent Activity
commented on an article about 1 month ago: Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques
commented on their article about 2 months ago: Universal Assisted Generation: Faster Decoding with Any Assistant Model
published an article about 2 months ago: Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques
Nadav-Timor's activity

commented on Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques, about 1 month ago

commented on Universal Assisted Generation: Faster Decoding with Any Assistant Model, about 2 months ago
Citation
@article{timor2025acceleratingllminferencelossless,
  title={Accelerating LLM Inference with Lossless Speculative Decoding Algorithms for Heterogeneous Vocabularies},
  author={Nadav Timor and Jonathan Mamou and Daniel Korat and Moshe Berchansky and Oren Pereg and Gaurav Jain and Roy Schwartz and Moshe Wasserblat and David Harel},
  year={2025},
  eprint={2502.05202},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2502.05202},
}

published an article about 2 months ago: Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques (with 8 other authors)

Vocab size in config.json mismatches the actual tokenizer size
#4 opened 3 months ago by Fizzarolli


upvoted a paper 5 months ago

published an article 6 months ago: Universal Assisted Generation: Faster Decoding with Any Assistant Model (with 7 other authors)
upvoted an article 7 months ago: Faster Assisted Generation with Dynamic Speculation (by 7 authors)
published an article 7 months ago: Faster Assisted Generation with Dynamic Speculation (with 6 other authors)
upvoted a paper 11 months ago

upvoted a collection 12 months ago

upvoted a paper 12 months ago
`max_position_embeddings=32768` with "attention span of 131K tokens"
#57 opened over 1 year ago by Nadav-Timor

Space isn't working because there is a runtime error
#1 opened over 1 year ago by Nadav-Timor
