Add Hugging Face as a provider #1911
Open: hanouticelina wants to merge 16 commits into pydantic:main from hanouticelina:hf-inference-providers
Commits (16):
2c3b9cb add hf inference providers support (hanouticelina)
537a657 update dependencies (hanouticelina)
af602a5 nit (hanouticelina)
1f3f7a2 update docstring (hanouticelina)
bea050c add tests (hanouticelina)
40aef2e add docs and known models for hf (hanouticelina)
af6fa42 Merge branch 'main' of github.com:hanouticelina/pydantic-ai into hf-i… (hanouticelina)
7a4b9a4 fix imports in test (hanouticelina)
a153081 fix tests (hanouticelina)
2f0ec51 fix provider test (hanouticelina)
69aee55 adapt cli test (hanouticelina)
f68dace re-record vcr cassettes (hanouticelina)
cc982e5 fix token name (hanouticelina)
00da46e fix examples test (hanouticelina)
bb20a34 Merge remote-tracking branch 'origin/main' into hf-inference-providers (Kludex)
922fd13 Add API docs and refactor a bit the wording (Kludex)
@@ -0,0 +1,7 @@
# `pydantic_ai.models.huggingface`

## Setup

For details on how to set up authentication with this model, see [model configuration for Hugging Face](../../models/huggingface.md).

::: pydantic_ai.models.huggingface
@@ -0,0 +1,91 @@
# Hugging Face

## Install

To use `HuggingFaceModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `huggingface` optional group:

```bash
pip/uv-add "pydantic-ai-slim[huggingface]"
```

## Configuration

To use [Hugging Face](https://huggingface.co/) through their main API, see the
[Inference Providers documentation](https://huggingface.co/docs/inference-providers/pricing) for all the details.
You can generate a Hugging Face access token at https://huggingface.co/settings/tokens.

## Hugging Face access token

Once you have a Hugging Face access token, you can set it as an environment variable:

```bash
export HF_TOKEN='hf_token'
```
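
Before constructing a model, it can help to confirm that the token is actually visible to the Python process. The following is a minimal sketch using only the standard library; `read_hf_token` is a hypothetical helper name introduced here for illustration and is not part of `pydantic_ai` or `huggingface_hub`.

```python
import os
from collections.abc import Mapping


def read_hf_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the Hugging Face token found in `env`, or raise if it is missing.

    Hypothetical helper for illustration only; not part of pydantic_ai.
    """
    # Treat an empty or whitespace-only value the same as an unset variable.
    token = env.get('HF_TOKEN', '').strip()
    if not token:
        raise RuntimeError('HF_TOKEN is not set; export it before creating the model.')
    return token
```

Passing the environment as a parameter keeps the check easy to exercise with a plain dictionary instead of the real process environment.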

You can then use [`HuggingFaceModel`][pydantic_ai.models.huggingface.HuggingFaceModel] by name:

```python
from pydantic_ai import Agent

agent = Agent('huggingface:Qwen/Qwen3-235B-A22B')
...
```
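
The string passed to `Agent` combines a provider prefix with a Hub model id. As a rough illustration of that naming convention, the sketch below splits such a string at the first colon. `split_model_string` is a hypothetical helper and an assumption about how the string can be read; it is not pydantic-ai's actual parsing code.

```python
def split_model_string(model: str) -> tuple[str, str]:
    """Split 'prefix:model-id' into its two parts.

    Hypothetical helper for illustration only; not pydantic-ai's parser.
    """
    # Split at the first colon only, so the model id itself is left untouched.
    prefix, sep, model_id = model.partition(':')
    if not sep or not prefix or not model_id:
        raise ValueError(f'expected "<prefix>:<model id>", got {model!r}')
    return prefix, model_id
```

For the example above, this yields the prefix `huggingface` and the model id `Qwen/Qwen3-235B-A22B`.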

Or initialise the model directly with just the model name:

```python
from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel

model = HuggingFaceModel('Qwen/Qwen3-235B-A22B')
agent = Agent(model)
...
```

By default, [`HuggingFaceModel`][pydantic_ai.models.huggingface.HuggingFaceModel] uses the
[`HuggingFaceProvider`][pydantic_ai.providers.huggingface.HuggingFaceProvider], which automatically selects
the first inference provider (Cerebras, Together AI, Cohere, etc.) available for the model, sorted by your
preferred order in https://hf.co/settings/inference-providers.
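
The default selection described above amounts to "the first provider in your preferred order that serves the model". Here is a minimal sketch of that rule using plain Python collections rather than any real Hugging Face API. The function name `pick_provider` and the provider lists in the usage lines are assumptions for illustration, not the library's implementation.

```python
from collections.abc import Iterable, Sequence


def pick_provider(preferred_order: Sequence[str], available: Iterable[str]) -> str:
    """Return the first provider in `preferred_order` that is in `available`.

    Hypothetical helper for illustration only; the real selection happens
    on the Hugging Face side, not in this function.
    """
    available_set = set(available)
    # Walk the user's preference list in order and stop at the first match.
    for name in preferred_order:
        if name in available_set:
            return name
    raise LookupError('none of the preferred providers serves this model')


# Hypothetical data: the user prefers cerebras, but only two others serve the model.
chosen = pick_provider(['cerebras', 'nebius', 'fireworks-ai'], ['fireworks-ai', 'nebius'])
```

With these made-up lists, `chosen` is `'nebius'`, because it is the earliest entry in the preference order that is also available.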

## Configure the provider

If you want to pass parameters in code to the provider, you can programmatically instantiate the
[`HuggingFaceProvider`][pydantic_ai.providers.huggingface.HuggingFaceProvider] and pass it to the model:

```python
from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel
from pydantic_ai.providers.huggingface import HuggingFaceProvider

model = HuggingFaceModel('Qwen/Qwen3-235B-A22B', provider=HuggingFaceProvider(api_key='hf_token', provider='nebius'))
agent = Agent(model)
...
```

## Custom Hugging Face client

[`HuggingFaceProvider`][pydantic_ai.providers.huggingface.HuggingFaceProvider] also accepts a custom
[`AsyncInferenceClient`][huggingface_hub.AsyncInferenceClient] via the `hf_client` parameter, so you can customise
the `headers`, `bill_to` (billing to an HF organization you're a member of), `base_url` etc. as defined in the
[Hugging Face Hub Python library docs](https://huggingface.co/docs/huggingface_hub/package_reference/inference_client).

```python
from huggingface_hub import AsyncInferenceClient

from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel
from pydantic_ai.providers.huggingface import HuggingFaceProvider

client = AsyncInferenceClient(
    bill_to='openai',
    api_key='hf_token',
    provider='fireworks-ai',
)

model = HuggingFaceModel(
    'Qwen/Qwen3-235B-A22B',
    provider=HuggingFaceProvider(hf_client=client),
)
agent = Agent(model)
...
```
How can we keep the list of those models up-to-date? Do you folks have an endpoint that we can call to list a lot of them, or something?