PagezyTest

community

AI & ML interests

None defined yet.

pagezytest's activity

pagezyhfΒ 
posted an update 16 days ago
view post
Post
1930
If you haven't had the chance to test the latest open model from Meta, Llama 4 Maverick, go try it on AMD MI 300 on Hugging Face!

amd/llama4-maverick-17b-128e-mi-amd
pagezyhfΒ 
posted an update 3 months ago
view post
Post
1748
We published https://huggingface.co/blog/deepseek-r1-aws!

If you are using AWS, give a read. It is a running document to showcase how to deploy and fine-tune DeepSeek R1 models with Hugging Face on AWS.

We're working hard to enable all the scenarios, whether you want to deploy to Inference Endpoints, Sagemaker or EC2; with GPUs or with Trainium & Inferentia.

We have full support for the distilled models, DeepSeek-R1 support is coming soon!! I'll keep you posted.

Cheers
  • 1 reply
Β·
pagezyhfΒ 
posted an update 4 months ago
pagezyhfΒ 
posted an update 5 months ago
pagezyhfΒ 
posted an update 5 months ago
view post
Post
982
It’s 2nd of December , here’s your Cyber Monday present 🎁 !

We’re cutting our price down on Hugging Face Inference Endpoints and Spaces!

Our folks at Google Cloud are treating us with a 40% price cut on GCP Nvidia A100 GPUs for the next 3️⃣ months. We have other reductions on all instances ranging from 20 to 50%.

Sounds like the time to give Inference Endpoints a try? Get started today and find in our documentation the full pricing details.
https://ui.endpoints.huggingface.co/
https://huggingface.co/pricing
pagezyhfΒ 
posted an update 5 months ago
view post
Post
310
Hello Hugging Face Community,

if you use Google Kubernetes Engine to host you ML workloads, I think this series of videos is a great way to kickstart your journey of deploying LLMs, in less than 10 minutes! Thank you @wietse-venema-demo !

To watch in this order:
1. Learn what are Hugging Face Deep Learning Containers
https://youtu.be/aWMp_hUUa0c?si=t-LPRkRNfD3DDNfr

2. Learn how to deploy a LLM with our Deep Learning Container using Text Generation Inference
https://youtu.be/Q3oyTOU1TMc?si=V6Dv-U1jt1SR97fj

3. Learn how to scale your inference endpoint based on traffic
https://youtu.be/QjLZ5eteDds?si=nDIAirh1r6h2dQMD

If you want more of these small tutorials and have any theme in mind, let me know!
pagezyhfΒ 
posted an update 6 months ago
view post
Post
1373
Hello Hugging Face Community,

I'd like to share here a bit more about our Deep Learning Containers (DLCs) we built with Google Cloud, to transform the way you build AI with open models on this platform!

With pre-configured, optimized environments for PyTorch Training (GPU) and Inference (CPU/GPU), Text Generation Inference (GPU), and Text Embeddings Inference (CPU/GPU), the Hugging Face DLCs offer:

⚑ Optimized performance on Google Cloud's infrastructure, with TGI, TEI, and PyTorch acceleration.
πŸ› οΈ Hassle-free environment setup, no more dependency issues.
πŸ”„ Seamless updates to the latest stable versions.
πŸ’Ό Streamlined workflow, reducing dev and maintenance overheads.
πŸ”’ Robust security features of Google Cloud.
☁️ Fine-tuned for optimal performance, integrated with GKE and Vertex AI.
πŸ“¦ Community examples for easy experimentation and implementation.
πŸ”œ TPU support for PyTorch Training/Inference and Text Generation Inference is coming soon!

Find the documentation at https://huggingface.co/docs/google-cloud/en/index
If you need support, open a conversation on the forum: https://discuss.huggingface.co/c/google-cloud/69