
Add Hugging Face as a provider #1911

Open · wants to merge 16 commits into base: main
7 changes: 7 additions & 0 deletions docs/api/models/huggingface.md
@@ -0,0 +1,7 @@
# `pydantic_ai.models.huggingface`

## Setup

For details on how to set up authentication with this model, see [model configuration for Hugging Face](../../models/huggingface.md).

::: pydantic_ai.models.huggingface
2 changes: 2 additions & 0 deletions docs/api/providers.md
@@ -29,3 +29,5 @@
::: pydantic_ai.providers.heroku.HerokuProvider

::: pydantic_ai.providers.openrouter.OpenRouterProvider

::: pydantic_ai.providers.huggingface.HuggingFaceProvider
91 changes: 91 additions & 0 deletions docs/models/huggingface.md
@@ -0,0 +1,91 @@
# Hugging Face

## Install

To use `HuggingFaceModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `huggingface` optional group:

```bash
pip/uv-add "pydantic-ai-slim[huggingface]"
```

## Configuration

To use [Hugging Face](https://huggingface.co/) Inference Providers, see the
[Inference Providers documentation](https://huggingface.co/docs/inference-providers/pricing) for details on
supported providers and pricing. You can generate a Hugging Face access token at https://huggingface.co/settings/tokens.

## Hugging Face access token

Once you have a Hugging Face access token, you can set it as an environment variable:

```bash
export HF_TOKEN='hf_token'
```

You can then use [`HuggingFaceModel`][pydantic_ai.models.huggingface.HuggingFaceModel] by name:

```python
from pydantic_ai import Agent

agent = Agent('huggingface:Qwen/Qwen3-235B-A22B')
...
```
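
For illustration, a minimal end-to-end sketch (assuming `HF_TOKEN` is set; `run_sync` and `output` are the usual `pydantic-ai` APIs, not shown in this diff):

```python
from pydantic_ai import Agent

agent = Agent('huggingface:Qwen/Qwen3-235B-A22B')
# Runs a single synchronous request against the auto-selected inference provider.
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> Paris
```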

Or initialise the model directly with just the model name:

```python
from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel

model = HuggingFaceModel('Qwen/Qwen3-235B-A22B')
agent = Agent(model)
...
```

By default, the [`HuggingFaceModel`][pydantic_ai.models.huggingface.HuggingFaceModel] uses the
[`HuggingFaceProvider`][pydantic_ai.providers.huggingface.HuggingFaceProvider], which automatically selects the
first inference provider (Cerebras, Together AI, Cohere, etc.) available for the model, sorted by your
preferred order in https://hf.co/settings/inference-providers.
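
If the description above holds, constructing the provider explicitly with no arguments should be equivalent; a sketch resting on one assumption not stated in this PR, namely that `HuggingFaceProvider` reads `HF_TOKEN` from the environment when no `api_key` is given:

```python
from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel
from pydantic_ai.providers.huggingface import HuggingFaceProvider

# Assumed equivalent to HuggingFaceModel('Qwen/Qwen3-235B-A22B'):
# the provider reads HF_TOKEN from the environment and auto-selects
# the first inference provider available for the model.
model = HuggingFaceModel('Qwen/Qwen3-235B-A22B', provider=HuggingFaceProvider())
agent = Agent(model)
...
```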

## Configure the provider

If you want to pass parameters in code to the provider, you can programmatically instantiate the
[`HuggingFaceProvider`][pydantic_ai.providers.huggingface.HuggingFaceProvider] and pass it to the model:

```python
from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel
from pydantic_ai.providers.huggingface import HuggingFaceProvider

model = HuggingFaceModel('Qwen/Qwen3-235B-A22B', provider=HuggingFaceProvider(api_key='hf_token', provider='nebius'))
agent = Agent(model)
...
```

## Custom Hugging Face client

[`HuggingFaceProvider`][pydantic_ai.providers.huggingface.HuggingFaceProvider] also accepts a custom
[`AsyncInferenceClient`][huggingface_hub.AsyncInferenceClient] via the `hf_client` parameter, so you can customise
`headers`, `bill_to` (billing to an HF organization you're a member of), `base_url`, etc., as defined in the
[Hugging Face Hub Python library docs](https://huggingface.co/docs/huggingface_hub/package_reference/inference_client).

```python
from huggingface_hub import AsyncInferenceClient

from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel
from pydantic_ai.providers.huggingface import HuggingFaceProvider

client = AsyncInferenceClient(
bill_to='openai',
api_key='hf_token',
provider='fireworks-ai',
)

model = HuggingFaceModel(
'Qwen/Qwen3-235B-A22B',
provider=HuggingFaceProvider(hf_client=client),
)
agent = Agent(model)
...
```
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -81,6 +81,7 @@ nav:
- api/models/gemini.md
- api/models/google.md
- api/models/groq.md
- api/models/huggingface.md
- api/models/instrumented.md
- api/models/mistral.md
- api/models/test.md
14 changes: 13 additions & 1 deletion pydantic_ai_slim/pydantic_ai/models/__init__.py
@@ -227,6 +227,14 @@
'heroku:claude-3-7-sonnet',
'heroku:claude-4-sonnet',
'heroku:claude-3-haiku',
'huggingface:Qwen/QwQ-32B',
'huggingface:Qwen/Qwen2.5-72B-Instruct',
'huggingface:Qwen/Qwen3-235B-A22B',
'huggingface:Qwen/Qwen3-32B',
'huggingface:deepseek-ai/DeepSeek-R1',
'huggingface:meta-llama/Llama-3.3-70B-Instruct',
'huggingface:meta-llama/Llama-4-Maverick-17B-128E-Instruct',
'huggingface:meta-llama/Llama-4-Scout-17B-16E-Instruct',
Comment on lines +230 to +237 (Member):

> How can we keep the list of those models up-to-date? Do you folks have an endpoint that we can call to list a lot of them, or something?
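
One way to approach this, sketched here as an assumption rather than a confirmed endpoint: the Hub exposes model listings via `huggingface_hub.list_models`, which could be used to regenerate the list periodically:

```python
from huggingface_hub import list_models

# Print the most-downloaded text-generation models on the Hub in the
# 'huggingface:<org>/<name>' format used above. Filtering to models that
# are actually served by an inference provider is left as an assumption.
for m in list_models(task='text-generation', sort='downloads', direction=-1, limit=8):
    print(f'huggingface:{m.id}')
```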

'mistral:codestral-latest',
'mistral:mistral-large-latest',
'mistral:mistral-moderation-latest',
@@ -539,7 +547,7 @@ def override_allow_model_requests(allow_model_requests: bool) -> Iterator[None]:
ALLOW_MODEL_REQUESTS = old_value # pyright: ignore[reportConstantRedefinition]


def infer_model(model: Model | KnownModelName | str) -> Model:
def infer_model(model: Model | KnownModelName | str) -> Model: # noqa: C901
"""Infer the model from the name."""
if isinstance(model, Model):
return model
@@ -593,6 +601,10 @@ def infer_model(model: Model | KnownModelName | str) -> Model:
from .bedrock import BedrockConverseModel

return BedrockConverseModel(model_name, provider=provider)
elif provider == 'huggingface':
from .huggingface import HuggingFaceModel

return HuggingFaceModel(model_name, provider=provider)
else:
raise UserError(f'Unknown model: {model}') # pragma: no cover
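
With this change, string-form model names with the `huggingface:` prefix are routed to `HuggingFaceModel`; a minimal illustrative sketch, not part of the diff (assuming `HF_TOKEN` is set, since the provider is constructed eagerly):

```python
from pydantic_ai.models import infer_model
from pydantic_ai.models.huggingface import HuggingFaceModel

# 'huggingface:<org>/<name>' strings resolve to a HuggingFaceModel instance.
model = infer_model('huggingface:Qwen/Qwen3-235B-A22B')
assert isinstance(model, HuggingFaceModel)
```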
