Accelerating LLM Inference with TGI on Intel Gaudi • By baptistecolle and 4 others • Mar 28 • 13
Benchmarking Language Model Performance on 5th Gen Xeon at GCP • By MatrixYao and 2 others • Dec 17, 2024 • 5
AMD + 🤗: Large Language Models Out-of-the-Box Acceleration with AMD GPU • Dec 5, 2023 • 3
Overview of natively supported quantization schemes in 🤗 Transformers • By ybelkada and 4 others • Sep 12, 2023 • 12