DeepInfra on Hugging Face Inference Providers 🔥

Summary (EN)

Hugging Face announced that DeepInfra is now a supported Inference Provider on the Hub, expanding the set of serverless model backends available directly from model pages and client SDKs. The integration allows developers to call DeepInfra-hosted models through Hugging Face’s router using either their own provider API key or Hugging Face-routed billing, with the latter charging standard provider API rates and applying usage to a Hugging Face account. The post says DeepInfra brings a catalog of more than 100 models and initially enables conversational and text-generation workloads on Hugging Face, including access to open-weight models such as DeepSeek V4, Kimi-K2.6, and GLM-5.1. Support for additional task types, including text-to-image, text-to-video, and embeddings, is described as forthcoming. Hugging Face also notes that the provider is integrated into its Python and JavaScript SDKs, so developers can use the same router endpoint and familiar OpenAI-compatible client patterns to invoke models hosted by DeepInfra. On the product side, users can set provider API keys in account settings, rank provider preferences, and see compatible third-party providers surfaced directly on model pages. The article presents the integration as a practical expansion of deployment choice and price-performance flexibility for application builders, while preserving a unified interface across providers and models inside the Hugging Face ecosystem.
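
To make the SDK side concrete, here is a minimal sketch (not taken from the post) of a chat completion routed to DeepInfra with the huggingface_hub Python client. It assumes huggingface_hub 0.28 or newer for the provider parameter and that the provider identifier is "deepinfra"; the model id is an illustrative placeholder, not a model the post confirms is served.

```python
# Minimal sketch: a chat completion routed to DeepInfra via the huggingface_hub SDK.
# Assumes huggingface_hub >= 0.28 and that the provider identifier is "deepinfra".
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="deepinfra",   # send the request to the DeepInfra backend (assumed identifier)
    api_key="hf_xxx",       # HF token for routed billing, or a provider key saved in settings
)

completion = client.chat_completion(
    model="deepseek-ai/DeepSeek-V3",  # hypothetical example id; use any model the provider serves
    messages=[{"role": "user", "content": "In one sentence, what is an inference provider?"}],
    max_tokens=100,
)
print(completion.choices[0].message.content)
```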

Summary (ZH)

Hugging Face announced that DeepInfra has officially joined its Inference Providers system, further expanding the serverless inference backends that can be called directly from Hub model pages and client SDKs. According to the article, developers can now reach DeepInfra-hosted models through Hugging Face's unified router, either billing directly against their own DeepInfra API key or letting Hugging Face route the requests and settle usage on a Hugging Face account at standard provider rates. Hugging Face says DeepInfra currently contributes a catalog of more than 100 models, with conversational and text-generation tasks supported first on the platform, covering open-weight models such as DeepSeek V4, Kimi-K2.6, and GLM-5.1; tasks beyond text generation, such as text-to-image, text-to-video, and embeddings, are listed as later extensions. The article also notes that the integration is wired into Hugging Face's Python and JavaScript SDKs, so developers can keep using the unified router entry point and OpenAI-compatible calling conventions instead of writing separate integration logic for each provider. On the product side, users can configure their own provider keys in account settings, adjust provider priority, and see compatible third-party inference services directly on model pages. Overall, the integration strengthens deployment choice, price-performance flexibility, and a unified development experience within the Hugging Face ecosystem, aimed at application developers who want to plug into multi-model inference quickly.
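
The OpenAI-compatible calling convention mentioned above can likewise be sketched directly against the router endpoint with the standard openai client. The ":deepinfra" suffix used to pin the provider and the model id below are assumptions for illustration rather than details confirmed by the post.

```python
# Minimal sketch of the OpenAI-compatible pattern: point the standard openai client
# at the Hugging Face router. The ":deepinfra" provider suffix and the model id are
# assumptions for illustration.
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",  # Hugging Face Inference Providers router
    api_key="hf_xxx",                             # Hugging Face token (routed billing)
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3:deepinfra",    # "<model>:<provider>" pins the provider (assumed)
    messages=[{"role": "user", "content": "Say hello through the router."}],
)
print(response.choices[0].message.content)
```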

Source

https://huggingface.co/blog/inference-providers-deepinfra