Add Hugging Face as a provider #1911
Conversation
Great! :) How can I help here?
Hi @Kludex,
Amazing! :)
PR Change Summary
Added support for Hugging Face as a model provider in the API, enabling users to utilize Hugging Face's Inference Providers.
Added Files
Pull Request Overview
Adds first-class support for Hugging Face Inference Providers by plugging in a new `HuggingFaceProvider`/`HuggingFaceModel` pair, updating tests, CLI, docs, and optional dependencies.
- Introduce `HuggingFaceProvider` and `HuggingFaceModel` in `pydantic_ai_slim`
- Extend tests and CLI to recognize `huggingface` as a provider
- Add documentation and update `pyproject.toml` for the `huggingface` optional group
Reviewed Changes
Copilot reviewed 15 out of 16 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description
---|---
pydantic_ai_slim/pydantic_ai/providers/huggingface.py | New provider implementation; wiring environment API key and client setup
pydantic_ai_slim/pydantic_ai/models/huggingface.py | New model implementation handling sync/async and streaming
docs/models/huggingface.md | Documentation for installing and configuring HF provider
pyproject.toml | Added huggingface to optional dependencies
tests/providers/test_huggingface.py | New unit tests for provider initialization/errors
tests/models/test_huggingface.py | Extensive tests covering completions, streaming, error handling
Comments suppressed due to low confidence (2)
pydantic_ai_slim/pydantic_ai/providers/huggingface.py:41
- The docstring states the default provider is "auto", but the `provider` parameter defaults to `None`. Please update the doc to match the code or adjust the default value accordingly.

provider: str | None = None,
docs/models/huggingface.md:9
- [nitpick] The install command looks like a typo. Consider replacing `pip/uv-add` with `pip install` to avoid confusion.

pip/uv-add "pydantic-ai-slim[huggingface]"
@property
def base_url(self) -> str:
    return self.client.model  # type: ignore
The `base_url` property is returning `self.client.model` instead of the actual base URL. It should return something like `self._client.base_url` (or store `base_url` on init) so that `provider.base_url` reflects the configured endpoint.
- return self.client.model  # type: ignore
+ return self._base_url
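One way to implement the reviewer's "store it on init" alternative, sketched with a minimal stand-in class (the fallback URL below is an assumption for illustration, not the PR's actual default, which comes from `AsyncInferenceClient`):

```python
from __future__ import annotations


class HuggingFaceProvider:
    """Illustrative stand-in for the PR's provider class, showing only the
    suggested fix: resolve and store the base URL at init time."""

    def __init__(self, base_url: str | None = None) -> None:
        # Assumed default endpoint for illustration; the real default is
        # determined by huggingface_hub's AsyncInferenceClient.
        self._base_url = base_url or 'https://router.huggingface.co/v1'

    @property
    def base_url(self) -> str:
        # Always reflects the configured endpoint, never the model name.
        return self._base_url
```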
'huggingface:Qwen/QwQ-32B',
'huggingface:Qwen/Qwen2.5-72B-Instruct',
'huggingface:Qwen/Qwen3-235B-A22B',
'huggingface:Qwen/Qwen3-32B',
'huggingface:deepseek-ai/DeepSeek-R1',
'huggingface:meta-llama/Llama-3.3-70B-Instruct',
'huggingface:meta-llama/Llama-4-Maverick-17B-128E-Instruct',
'huggingface:meta-llama/Llama-4-Scout-17B-16E-Instruct',
How can we keep the list of those models up-to-date? Do you folks have an endpoint that we can call to list a lot of them, or something?
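On keeping the list fresh: the Hub does expose model listings (e.g. `huggingface_hub.list_models`), so a periodic refresh script is one option. The `inference_provider` filter argument below is an assumption about recent `huggingface_hub` versions and should be verified against the installed version's docs. A sketch:

```python
def pydantic_ai_model_names(model_ids):
    """Format Hub model ids as pydantic-ai's 'huggingface:<id>' strings."""
    return sorted(f'huggingface:{model_id}' for model_id in model_ids)


if __name__ == '__main__':
    # Hypothetical refresh step: list models served by Inference Providers.
    # `inference_provider` is an assumed filter name; check your
    # huggingface_hub version before relying on it.
    from huggingface_hub import list_models

    ids = [m.id for m in list_models(inference_provider='all', limit=100)]
    print('\n'.join(pydantic_ai_model_names(ids)))
```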
I wrote some comments here.
We need to create `ThinkingPart`s (does the HuggingFace client handle them?) and we need to add code coverage.
I prefer tests strictly with VCR, if possible.
api_key: str | None = None,
hf_client: AsyncInferenceClient | None = None,
http_client: AsyncClient | None = None,
provider: str | None = None,
It's a bit weird that, inside a class that is itself called a provider, you can set a `provider` as well. Is there an alternative name here?
if http_client is not None:
    raise ValueError('`http_client` is ignored for HuggingFace provider, please use `hf_client` instead')
- raise ValueError('`http_client` is ignored for HuggingFace provider, please use `hf_client` instead')
+ raise ValueError('`http_client` is ignored for HuggingFace provider, please use `hf_client` instead.')
if base_url is not None and provider is not None:
    raise ValueError('Cannot provide both `base_url` and `provider`')
- raise ValueError('Cannot provide both `base_url` and `provider`')
+ raise ValueError('Cannot provide both `base_url` and `provider`.')
if hf_client is None:
    self._client = AsyncInferenceClient(api_key=api_key, provider=provider, base_url=base_url)  # type: ignore
What's the type issue here?
def __init__(
    self,
    base_url: str | None = None,
    api_key: str | None = None,
    hf_client: AsyncInferenceClient | None = None,
    http_client: AsyncClient | None = None,
    provider: str | None = None,
For a better user experience, it would be nice to create some overloads to reflect the `ValueError`s we have below. But not a blocker.
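A sketch of what such overloads could look like, expressing that `base_url` and `provider` are mutually exclusive (the class body here is illustrative, not the PR's implementation):

```python
from __future__ import annotations

from typing import overload


class HuggingFaceProvider:
    @overload
    def __init__(self, *, base_url: str, api_key: str | None = None) -> None: ...
    @overload
    def __init__(self, *, provider: str, api_key: str | None = None) -> None: ...
    @overload
    def __init__(self, *, api_key: str | None = None) -> None: ...

    def __init__(
        self,
        *,
        base_url: str | None = None,
        provider: str | None = None,
        api_key: str | None = None,
    ) -> None:
        # The runtime check mirrors the PR; the overloads let type checkers
        # reject the invalid combination before it ever runs.
        if base_url is not None and provider is not None:
            raise ValueError('Cannot provide both `base_url` and `provider`')
        self.base_url = base_url
        self.provider = provider
        self.api_key = api_key
```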
@@ -291,6 +291,11 @@ def openrouter_api_key() -> str:
    return os.getenv('OPENROUTER_API_KEY', 'mock-api-key')


@pytest.fixture(scope='session')
def huggingface_api_key() -> str:
    return os.getenv('HF_TOKEN', 'hf_token') or os.getenv('HUGGINGFACE_API_KEY', 'hf_token')
What's this `HUGGINGFACE_API_KEY`? It's not mentioned or used in the code.
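Worth noting for the fixture above: because the first `os.getenv` call has a non-empty default, it always returns a truthy value, so the `or` branch reading `HUGGINGFACE_API_KEY` can never be reached. A sketch of that behavior, reading from an explicit dict in place of `os.environ` (the helper name is illustrative):

```python
def resolve_api_key(env: dict) -> str:
    # Mirrors the fixture's expression, but reads from a plain dict so the
    # short-circuit is easy to see: the default 'hf_token' is truthy, so the
    # right-hand side of `or` is never evaluated.
    return env.get('HF_TOKEN', 'hf_token') or env.get('HUGGINGFACE_API_KEY', 'hf_token')
```

Even with only `HUGGINGFACE_API_KEY` set in the environment, the first `get()` still returns the truthy default, so the real value is never consulted.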
@hanouticelina If I can help to speed things up here, please let me know.
@Kludex thank you! I'll be working on this PR in the next few days and get back to you if I need any help :)
closes #1085.
Hi there, maintainer of the `huggingface_hub` library 🤗 here. This PR introduces support for Hugging Face's Inference Providers (documentation here) as a Model Provider.

Our API is fully compatible with the OpenAI REST API spec, and the implementation closely mirrors the existing `OpenAIProvider`/`OpenAIModel` pair. Under the hood, we use the `huggingface_hub.AsyncInferenceClient` client, which is a drop-in replacement for the async OpenAI client but includes provider-specific (de)serialization logic that cannot be reproduced reliably with the OpenAI client alone (see @Wauplin's detailed explanation here).

Note that `huggingface_hub` is a stable and widely used library that was already listed as a dependency in the lockfile.

TODO: