VertexAI Model-Registry & Model-Deployer #3161
Conversation
- Update model source URI retrieval in VertexAIModelRegistry.
- Enhance BaseModelDeployer to check and start inactive services.
- Set default replica counts to 1 and sync to False in VertexBaseConfig.
- Rename and update documentation for deployment service creation in VertexModelDeployer.
Left initial thoughts. Also should we add these to the 1-stack deployment of the GCP full stack endpoint?
self.setup_aiplatform()
try:
    model_version = aiplatform.ModelVersion(
        model_name=f"{name}@{version}"
Above, for the display name, we use name_version. Are you sure this is the correct name to delete? Where do we specify it?
I have to revive this thread too: you use name@version here, but this is different from the convention you use when you create the model version, which is name_version.
Besides this, I strongly suggest that you stop using this flat naming scheme in VertexAI and actually start using the Vertex AI model versions and properly mapping them to ZenML model versions instead.
After reading the Vertex AI docs, it was apparent to me that they actually support this type of syntax, where X@Y really means version Y of model X. However, this only works with model IDs, not with the model display name.
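To make the distinction concrete, here is a tiny sketch; the helper name and the numeric ID below are made up for illustration, but the MODEL_ID@VERSION shape follows the Vertex AI docs referenced above:

```python
def versioned_model_name(model_id: str, version: str) -> str:
    """Build the "MODEL_ID@VERSION" identifier Vertex AI accepts.

    Note: the "@" syntax only resolves against the numeric model ID
    (or full resource name), never against the model display name.
    """
    return f"{model_id}@{version}"

# e.g. aiplatform.ModelVersion(model_name=versioned_model_name("1234567890", "2"))
```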
…rity and consistency
…sertions in tests
…ithub.com/zenml-io/zenml into feature/vertex-ai-deployer-model-registry
…ice for improved readability and consistency
…delRegistry and VertexDeploymentService
I still have to re-review the model deployer code, but I thought I'd just open up a discussion based on what I could find so far.
class VertexAIModelRegistryConfig(
    BaseModelRegistryConfig,
    GoogleCredentialsConfigMixin,
    VertexAIModelConfig,
):
    """Configuration for the VertexAI model registry.

    This configuration combines:
    - Base model registry configuration
    - Google Cloud authentication
    - Vertex AI model configuration
    """
Just curious: why did you include the Vertex AI model configuration here as component configuration attributes? You wouldn't normally expect model instance attributes like artifact_uri, model_source_spec, model_source_info, training_pipeline_id and training_pipeline_display_name to be used to configure the model registry itself.
@property
def managed_by(self) -> str:
    """Returns the managed by attribute.

    Returns:
        The managed by attribute.
    """
    return "zenml"
This property doesn't do anything. It's not explicitly called anywhere and it's not dumped with the rest of the model when calling metadata.model_dump().
But this is used at the filtering level to get only model versions registered by ZenML. The comment above does make sense, though: the filter should reference the exact model registry component, not just "zenml".
@safoinme You're correct that you use filter_expr = "labels.managed_by=zenml" to filter model versions, but you set this label explicitly (see: labels["managed_by"] = "zenml"), whereas this particular property is really unused.
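Following the suggestion above to scope the filter to the exact registry component, a sketch could look like this (the registry_id label and the helper are hypothetical; only labels.managed_by=zenml exists in the current code):

```python
def build_version_filter(registry_id: str) -> str:
    # Hypothetical: tag versions with the specific registry component's
    # ID at upload time, then filter on both labels instead of only the
    # generic "managed_by" marker.
    return f"labels.managed_by=zenml AND labels.registry_id={registry_id}"
```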
upload_arguments = {
    "serving_container_image_uri": serving_container_image_uri,
    "artifact_uri": model_source_uri or self.config.artifact_uri,
    "is_default_version": self.config.is_default_version
    if self.config.is_default_version is not None
    else True,
    "version_aliases": self.config.version_aliases,
    "version_description": self.config.version_description,
    "serving_container_predict_route": self.config.container.predict_route
    if self.config.container
    else None,
    "serving_container_health_route": self.config.container.health_route
    if self.config.container
    else None,
    "description": description or self.config.description,
    "serving_container_command": self.config.container.command
    if self.config.container
    else None,
    "serving_container_args": self.config.container.args
    if self.config.container
    else None,
    "serving_container_environment_variables": self.config.container.env
    if self.config.container
    else None,
    "serving_container_ports": self.config.container.ports
    if self.config.container
    else None,
    "display_name": self.config.display_name or model_display_name,
    "project": self.config.project_id,
    "location": self.config.location,
    "labels": labels,
    "encryption_spec_key_name": self.config.encryption_spec_key_name,
}
If you look at this, you'll notice that you mostly use attributes from the Model Registry stack component configuration to configure the model. This significantly restricts your implementation, because you will need to register different model registry instances in order to customize any of these parameters to register different models.
This is a fundamental design limitation. What you should do instead is allow users to override some or all of these parameters for each individual model that they register, by using the metadata, just like you did with serving_container_image_uri. You can even go as far as validating and converting the metadata into a VertexAIModelConfig object, or some subset of it that you allow to be overridden for individual models.
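As a rough sketch of that validation step (field names are illustrative placeholders, not the actual VertexAIModelConfig schema):

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class ModelOverrides:
    """Hypothetical subset of VertexAIModelConfig that a caller may
    override per registered model via the metadata dict."""

    artifact_uri: Optional[str] = None
    is_default_version: Optional[bool] = None
    version_description: Optional[str] = None


def extract_overrides(metadata: Dict[str, Any]) -> ModelOverrides:
    # Keep only keys that map to known override fields; everything else
    # stays plain metadata on the registered model version.
    known = {f.name for f in fields(ModelOverrides)}
    return ModelOverrides(**{k: v for k, v in metadata.items() if k in known})
```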
# leveraging extra settings from self.config.
upload_arguments = {
    "serving_container_image_uri": serving_container_image_uri,
    "artifact_uri": model_source_uri or self.config.artifact_uri,
This is the sort of thing that I mentioned: it makes no sense to have the model artifact URI modeled as a Model Registry stack component configuration attribute. Who would configure a Model Registry stack component for a single model? And this is just one example; there are many others just like it.
model = self._init_vertex_model(name=name, version=version)
assert isinstance(model, aiplatform.Model)
What if the model is no longer present in Vertex AI? model would be None in that case and this assert would be triggered.
parent_model = self._init_vertex_model(name=name, version=version)
assert isinstance(parent_model, aiplatform.Model)
Same here: what if the model is no longer present in Vertex AI? parent_model would be None in that case and this assert would be triggered.
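One way to surface this failure mode explicitly instead of via an assert (a sketch; the helper name and error message are made up):

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def require_model(model: Optional[T], name: str, version: str) -> T:
    # Raise a descriptive error instead of an opaque AssertionError when
    # the model was deleted from Vertex AI out of band.
    if model is None:
        raise RuntimeError(
            f"Model '{name}' (version '{version}') was not found in "
            "Vertex AI; it may have been deleted outside of ZenML."
        )
    return model
```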
Second batch of comments, focused mostly around the model deployer functionality.
)


def sanitize_vertex_label(value: str) -> str:
You already have a function with the exact implementation in the model registry. Can you please move it somewhere common in a util module to avoid duplication?
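A shared utility could look roughly like this (the exact rules are an assumption based on Vertex AI's documented label constraints: lowercase letters, digits, dashes and underscores, at most 63 characters; the original implementation may differ):

```python
import re


def sanitize_vertex_label(value: str) -> str:
    """Normalize a string into a valid Vertex AI label value."""
    value = value.lower()
    # Replace anything outside the allowed character set.
    value = re.sub(r"[^a-z0-9_-]", "-", value)
    # Label values are limited to 63 characters.
    return value[:63]
```

Moving this into a common util module (e.g. the integration's utils) would let both the model registry and the model deployer import the same implementation.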
    The VertexDeploymentService instance
    """
    # Initialize client with fresh credentials
    self._init_vertex_client()
You are still configuring the aiplatform library globally with credentials, which is not a good idea when you have a stack that can use multiple different components with credentials coming from multiple different sources like service connectors.
What you should do here is to fetch the stack component's credentials and pass them to the VertexDeploymentService._credentials private attribute.
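A minimal sketch of that flow, using stand-in names rather than the real ZenML or Vertex AI classes:

```python
from typing import Any, Dict, Optional


class DeploymentServiceSketch:
    """Stand-in for VertexDeploymentService holding its own credentials."""

    def __init__(self) -> None:
        self._credentials: Optional[Any] = None
        self._project_id: Optional[str] = None

    def set_auth(self, credentials: Any, project_id: str) -> None:
        # Component-scoped credentials, instead of relying on the
        # process-global state set by aiplatform.init().
        self._credentials = credentials
        self._project_id = project_id

    def client_kwargs(self) -> Dict[str, Any]:
        # Every Vertex AI call made by the service would pass these
        # explicitly rather than inheriting global configuration.
        return {"credentials": self._credentials, "project": self._project_id}
```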
    default_factory=lambda: VertexServiceStatus()
)
_project_id: Optional[str] = PrivateAttr(default=None)
_credentials: Optional[Any] = PrivateAttr(default=None)
You modeled the credentials as a private attribute and even use it everywhere in the calls you make in this class to Vertex AI, but you're not actually setting the attribute in the model deployer implementation. This means you're still using the global Vertex AI credentials instead of those passed through _credentials.
deploy_kwargs.update(
    {
        "container_image_uri": self.config.container.image_uri,
        "container_ports": self.config.container.ports,
        "container_predict_route": self.config.container.predict_route,
        "container_health_route": self.config.container.health_route,
        "container_env": self.config.container.env,
    }
)
I see this type of configuration information present in at least 4 places:
- the model registry stack component configuration (I already mentioned this might be a design problem)
- the model version stored in the Vertex AI model registry (currently populated from the Vertex AI model registry stack component configuration, but it should be possible to override it on a per-model basis, as I mentioned earlier)
- the model deployer stack configuration
- the model deployment configuration passed by users
How do all these 4 places of origin for the same type of configuration relate to each other? More concretely:
- are the values stored in the Vertex AI model version used as a default if not set in the model deployment configuration?
- what happens with the values stored in the model deployer stack component configuration? Are those used at all?
- should these values even be configured in the model version when stored in the model registry, or should they only be part of the model deployment?
> supplied by the user in the model deployment configuration and the same information configured in the model registry
The configurations in the stack component have been changed based on your comments, which totally makes sense. But this config will still exist for both the model registry model version and the service deployment configuration. As mentioned earlier, there are two scenarios: when the model is deployed using ZenML, the service config can override the model version config; otherwise the model version config is the default, and the model can be deployed even from the UI.
model_version = model_registry.register_model_version(
    name=current_model.name,
    version=str(current_model.version),
    model_source_uri=current_model.get_model_artifact("sklearn_classifier").uri,
    description="ZenML model version registered with extended configuration",
    metadata=ModelRegistryModelMetadata(
        zenml_pipeline_name=get_step_context().pipeline.name,
        zenml_pipeline_run_uuid=str(get_step_context().pipeline_run.id),
        zenml_step_name=get_step_context().step_run.name,
    ),
    config=model_config,
)
This documentation code snippet looks exactly how I imagined the Vertex AI model registry would work: the register_model_version method takes in a config attribute where you can customize configuration parameters that will be stored in the Vertex AI model and used during deployment.
However, you'll notice that the documentation is incorrect: there is no config argument that the current implementation permits. In fact, these configuration values are taken from the Model Registry stack component configuration instead, which is not as good a design because it's less flexible.
I think you should make this possible: allow config to be passed as an argument and use it to override whatever comes from the model registry stack component configuration.
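A sketch of the override semantics this could have (the dataclass fields are placeholders, not the real config schema):

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class RegistryConfig:
    """Placeholder for registry-level defaults and per-model overrides."""

    artifact_uri: Optional[str] = None
    version_description: Optional[str] = None


def merge_config(
    defaults: RegistryConfig, override: Optional[RegistryConfig]
) -> RegistryConfig:
    # Per-model `config` values win; unset (None) fields fall back to
    # the stack component defaults.
    if override is None:
        return defaults
    merged = asdict(defaults)
    merged.update({k: v for k, v in asdict(override).items() if v is not None})
    return RegistryConfig(**merged)
```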
class VertexModelDeployerConfig(
    BaseModelDeployerConfig,
    GoogleCredentialsConfigMixin,
    VertexAIEndpointConfig,
):
Similar to the model registry: you allow all these model-specific configuration attributes like artifact_uri, model_source_spec, model_source_info, training_pipeline_id and training_pipeline_display_name to be configured as stack component configuration attributes, but they don't really make sense as global attributes. Moreover, you are not using the stack component configuration attributes at all from what I could see, not even as defaults.
Co-authored-by: Stefan Nica <[email protected]>
…try.py Co-authored-by: Stefan Nica <[email protected]>
…ithub.com/zenml-io/zenml into feature/vertex-ai-deployer-model-registry
Describe changes
TODO
Pre-requisites
Please ensure you have done the following:
My branch is based on develop and the open PR is targeting develop. If your branch wasn't based on develop, read the Contribution guide on rebasing a branch to develop.
Types of changes