[FEEDBACK] Inference Providers
Any inference provider you love, and that you'd like to be able to access directly from the Hub?
Love that I can call DeepSeek R1 directly from the Hub 🔥
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="together",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxx"
)

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?"
    }
]

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=messages,
    max_tokens=500
)

print(completion.choices[0].message)
Is it possible to set a monthly payment budget or rate limits for all the external providers? I don't see such options in the billing tab. If a key or session token is stolen, it could be quite dangerous to my thin wallet :(
@benhaotang you already get spending notifications when crossing important thresholds ($10, $100, $1,000) but we'll add spending limits in the future
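Until server-side spending limits exist, a script can at least cap its own usage. The following is a minimal sketch of such a client-side guard, not a Hub feature: `SpendingGuard`, `BudgetExceeded`, and the per-token price are all hypothetical, and the real price depends on the provider and model. It does not protect against a stolen key, since it only runs in your own code.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spending past the configured budget."""


class SpendingGuard:
    """Tracks estimated spend locally and refuses charges over a budget."""

    def __init__(self, monthly_budget_usd):
        self.monthly_budget_usd = monthly_budget_usd
        self.spent_usd = 0.0

    def charge(self, total_tokens, usd_per_million_tokens):
        # The price is an assumption supplied by the caller, not fetched from the Hub.
        cost = total_tokens / 1_000_000 * usd_per_million_tokens
        if self.spent_usd + cost > self.monthly_budget_usd:
            raise BudgetExceeded(
                f"charge of ${cost:.4f} would exceed the "
                f"${self.monthly_budget_usd:.2f} budget"
            )
        self.spent_usd += cost
        return cost


# Hypothetical usage: budget of $5, price of $2 per million tokens.
guard = SpendingGuard(monthly_budget_usd=5.0)
guard.charge(total_tokens=500, usd_per_million_tokens=2.0)
```

In practice the token count could come from the usage information returned with each completion, and the guard would be checked before sending the next request.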
Thanks for your quick reply, good to know!
Would be great if you could add Nebius AI Studio to the list :) New inference provider on the market, with the absolute cheapest prices and the highest rate limits...
Could be good to add featherless.ai
TitanML !!
question around enterprise accounts:
- each user gets a $2 quota?
- in order to apply the quota they have to use a key from their ent account and the X-HF-Bill-To header?
- how much does a call cost? I can't find any documentation...
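For readers wondering what sending that header looks like, here is a rough sketch using only the standard library. The endpoint URL, the token, and the organization name are assumptions for illustration; only the `X-HF-Bill-To` header name comes from the question above. The request is built but not sent.

```python
import json
import urllib.request

# Assumed endpoint for illustration; check the Inference Providers docs for the real one.
ROUTER_URL = "https://router.huggingface.co/v1/chat/completions"


def build_billed_request(token, org, model, messages):
    """Build (but do not send) a chat request billed to an organization."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "X-HF-Bill-To": org,  # organization whose quota should be charged
    }
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(ROUTER_URL, data=body, headers=headers, method="POST")


req = build_billed_request(
    token="hf_xxxxxxxx",      # placeholder token
    org="my-enterprise-org",  # hypothetical organization name
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
# urllib normalizes header-name capitalization; HTTP header names are case-insensitive.
print(req.header_items())
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a valid token.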
Would be great to add Clarifai to the list. The platform is vendor-agnostic, supporting AWS, GCP, Vultr, and Oracle. We are planning to add a lot more providers soon.
Our company wants to provide some private models.
Is it possible in Model Mapping [ https://huggingface.co/docs/inference-providers/en/register-as-a-provider#3-model-mapping-api ] to have hfModel as a "stub" only and providerModel as a real model?
Just signed up with HF and had some questions for the general community to help us get started. We plan to use the Cerebras Inference Provider using direct calls rather than routing through HF itself.
With a Pro subscription, are there any limits to token usage or queuing constraints when using a custom API key and direct calls? The free tier on Cerebras did have such constraints.
Thanks in advance
Hey all, I'd like to make nCompass (https://docs.ncompass.tech/api-reference/quickstart) an inference provider on HF. We build GPU optimizations to be able to support an API without rate limits by maximizing GPU utilization. I would really appreciate it if someone could help us with the process of becoming an inference provider.
Hi, I have a problem using smolagents' HfApiModel for inference. I noticed that even though I belong to an enterprise organization, the Inference API uses the credits of my free account
and not those of the organization.
Yet I read here (https://huggingface.co/docs/inference-providers/en/pricing)
that it should automatically use those of the organization.
I also wrote here (https://discuss.huggingface.co/t/hugging-face-payment-error-402-youve-exceeded-monthly-quota/144968/10?u=alexman83) and it seems there is a bug...
How can we solve it? It is important for our company to be able to use this service.
Thank you for your help!
@sh8459131 When using a custom key, requests are forwarded to Cerebras directly so their limits will apply
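The distinction in this reply can be sketched as a small helper. It assumes that Hugging Face access tokens start with `hf_` and that any other key is a provider's own key; the function names are hypothetical and this is not the library's actual routing code.

```python
def is_hugging_face_token(api_key):
    """Assumption: Hugging Face user access tokens start with 'hf_'."""
    return api_key.startswith("hf_")


def whose_limits_apply(api_key, provider):
    """Return which party's rate limits and billing a request is subject to."""
    if is_hugging_face_token(api_key):
        # Routed through Hugging Face: usage is billed on the Hub account.
        return "Hugging Face"
    # Custom provider key: the request is forwarded directly to the provider.
    return provider


print(whose_limits_apply("hf_xxxxxxxx", "cerebras"))
print(whose_limits_apply("csk-xxxxxxxx", "cerebras"))  # hypothetical provider key format
```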
@alexman83 can you share some sample code you're using? We might need to update smolagents to expose the new bill_to parameter. cc @albertvillanova for viz
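In the meantime, a sketch of how the `bill_to` parameter might be used directly with `InferenceClient`. It assumes a recent `huggingface_hub` release whose `InferenceClient` constructor accepts `bill_to`, as the reply suggests; the helper names and the organization name are hypothetical.

```python
def bill_to_kwargs(token, org, provider="together"):
    """Keyword arguments for a client whose usage is billed to an organization."""
    return {"provider": provider, "api_key": token, "bill_to": org}


def make_org_client(token, org):
    # Imported lazily so the helper above works even without the package installed.
    # Assumption: the installed huggingface_hub version supports `bill_to`.
    from huggingface_hub import InferenceClient

    return InferenceClient(**bill_to_kwargs(token, org))


# Hypothetical token and organization name.
kwargs = bill_to_kwargs("hf_xxxxxxxx", "my-enterprise-org")
print(kwargs["bill_to"])
```

Calling `make_org_client(...)` and then `client.chat.completions.create(...)` as in the first post would then charge the organization rather than the personal account, if the assumption above holds.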