AI & ML interests

Accelerating DL

Recent Activity

Jingya updated a model about 1 month ago
optimum/pixart_sigma_pipe_xl_2_512_ms_neuronx
Jingya published a model about 1 month ago
optimum/pixart_sigma_pipe_xl_2_512_ms_neuronx
Jingya updated a model about 1 month ago
optimum/bge-base-en-v1.5-neuronx

optimum's activity

jeffboudier posted an update 1 day ago
So many orgs on HF would really benefit from the security and governance built into Enterprise Hub - I wrote a guide on why and how to upgrade: jeffboudier/how-to-upgrade-to-enterprise

For instance, did you know about Resource Groups?
pagezyhf posted an update 16 days ago
If you haven't had the chance to test Meta's latest open model, Llama 4 Maverick, go try it on AMD MI300 on Hugging Face!

amd/llama4-maverick-17b-128e-mi-amd
jeffboudier posted an update about 1 month ago
Llama 4 is out and Scout is already on the Dell Enterprise Hub to deploy on Dell systems 👉 dell.huggingface.co
Jingya published a model about 1 month ago
jeffboudier posted an update about 1 month ago
Enterprise orgs now enable serverless Inference Providers for all members
- includes $2 of free usage per org member (e.g. an Enterprise org with 1,000 members shares $2,000 of free credit each month)
- admins can set a monthly spend limit for the entire org
- works today with Together, fal, Novita, Cerebras and HF Inference.

Here's the doc to bill Inference Providers usage to your org: https://huggingface.co/docs/inference-providers/pricing#organization-billing
regisss posted an update 3 months ago
Nice paper comparing the fp8 inference efficiency of Nvidia H100 and Intel Gaudi2: An Investigation of FP8 Across Accelerators for LLM Inference (2502.01070)

The conclusion is interesting: "Our findings highlight that the Gaudi 2, by leveraging FP8, achieves higher throughput-to-power efficiency during LLM inference"

One often-overlooked aspect of AI hardware accelerators is how much less energy they can consume than GPUs. It's nice to see researchers starting to carry out experiments to measure this!

Gaudi3 results soon...
pagezyhf posted an update 3 months ago
We published https://huggingface.co/blog/deepseek-r1-aws!

If you are using AWS, give it a read. It is a living document showcasing how to deploy and fine-tune DeepSeek R1 models with Hugging Face on AWS.

We're working hard to enable all the scenarios, whether you want to deploy to Inference Endpoints, SageMaker, or EC2; with GPUs or with Trainium & Inferentia.

We have full support for the distilled models; DeepSeek-R1 support is coming soon! I'll keep you posted.

Cheers
pagezyhf posted an update 4 months ago
jeffboudier posted an update 4 months ago
NVIDIA just announced the Cosmos World Foundation Models, available on the Hub: nvidia/cosmos-6751e884dc10e013a0a0d8e6

Cosmos is a family of pre-trained models purpose-built for generating physics-aware videos and world states to advance physical AI development.
The release includes Tokenizers nvidia/cosmos-tokenizer-672b93023add81b66a8ff8e6

Learn more in this great community article by @mingyuliutw and @PranjaliJoshi https://huggingface.co/blog/mingyuliutw/nvidia-cosmos
regisss posted an update 5 months ago