# VertexAI Model-Registry & Model-Deployer (#3161)

base: develop
# Vertex AI Model Deployer

[Vertex AI](https://cloud.google.com/vertex-ai) provides managed infrastructure for deploying machine learning models at scale. The Vertex AI Model Deployer in ZenML allows you to deploy models to Vertex AI endpoints, providing a scalable and fully managed solution for model serving.

## When to use it?

Use the Vertex AI Model Deployer when:

- You are leveraging Google Cloud Platform (GCP) and wish to integrate with its native ML serving infrastructure.
- You need enterprise-grade model serving capabilities complete with autoscaling and GPU acceleration.
- You require a fully managed solution that abstracts away the operational overhead of serving models.
- You need to deploy models directly from your Vertex AI Model Registry, or even from other registries or artifacts.
- You want seamless integration with GCP services like Cloud Logging, IAM, and VPC.

This deployer is especially useful for production deployments, high-availability serving, and dynamic scaling based on workloads.

{% hint style="info" %}
For best results, the Vertex AI Model Deployer works with a Vertex AI Model Registry in your ZenML stack. This allows you to register models with detailed metadata and configuration and then deploy a specific version seamlessly.
{% endhint %}

## How to deploy it?

The Vertex AI Model Deployer is enabled via the ZenML GCP integration. First, install the integration:

```shell
zenml integration install gcp -y
```

### Authentication and Service Connector Configuration

The deployer requires proper GCP authentication. The recommended approach is to use a ZenML Service Connector:

```shell
# Register the service connector with a service account key
zenml service-connector register vertex_deployer_connector \
    --type gcp \
    --auth-method=service-account \
    --project_id=<PROJECT_ID> \
    [email protected] \
    --resource-type gcp-generic

# Register the model deployer and connect it to the service connector
zenml model-deployer register vertex_deployer \
    --flavor=vertex \
    --location=us-central1 \
    --connector vertex_deployer_connector
```
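
With the deployer registered, add it to the stack your pipelines run on. A minimal sketch, assuming an active stack is already set (`-d` selects the model deployer component):

```shell
# Add the Vertex AI deployer to the active stack
zenml stack update -d vertex_deployer
```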

{% hint style="info" %}
The service account used for deployment must have the following permissions:
- `Vertex AI User` to enable model deployments
- `Vertex AI Service Agent` for model endpoint management
- `Storage Object Viewer` if the model artifacts reside in Google Cloud Storage
{% endhint %}
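
If you manage IAM yourself, the corresponding grants can be applied with `gcloud` along the lines below. This is a sketch: the service-account address is a placeholder, and the role IDs are assumed to map to `roles/aiplatform.user` and `roles/storage.objectViewer`:

```shell
# Grant the Vertex AI User role to the deployment service account
gcloud projects add-iam-policy-binding <PROJECT_ID> \
    --member="serviceAccount:<SA_NAME>@<PROJECT_ID>.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"

# Grant read access to model artifacts stored in GCS
gcloud projects add-iam-policy-binding <PROJECT_ID> \
    --member="serviceAccount:<SA_NAME>@<PROJECT_ID>.iam.gserviceaccount.com" \
    --role="roles/storage.objectViewer"
```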

## How to use it

A complete usage example is available in the [ZenML Examples repository](https://github.com/zenml-io/zenml-projects/tree/main/vertex-registry-and-deployer).

### Deploying a Model in a Pipeline

Below is an example of a deployment step that uses the updated configuration options. In this example, the deployment configuration supports:

- **Model versioning**: Explicitly provide the model version (using the full resource name from the model registry).
- **Display name and sync mode**: Fields such as `display_name` (for a friendly endpoint name) and `sync` (to wait for deployment completion) are available.
- **Traffic configuration**: Route a certain percentage (e.g., 100%) of traffic to this deployment.
- **Advanced options**: You can still specify custom container settings, resource specifications (including GPU options), and explanation configuration via shared classes from `vertex_base_config.py`.

```python
from typing import Optional

from typing_extensions import Annotated

from zenml import ArtifactConfig, get_step_context, step
from zenml.client import Client
from zenml.integrations.gcp.services.vertex_deployment import (
    VertexDeploymentConfig,
    VertexDeploymentService,
)


@step(enable_cache=False)
def model_deployer(
    model_registry_uri: str,
    is_promoted: bool = False,
) -> Annotated[
    Optional[VertexDeploymentService],
    ArtifactConfig(name="vertex_deployment", is_deployment_artifact=True),
]:
    """Model deployer step.

    Args:
        model_registry_uri: The full resource name of the model in the registry.
        is_promoted: Flag indicating if the model is promoted to production.

    Returns:
        The deployed model service, or None if the model was not promoted.
    """
    if not is_promoted:
        # Skip deployment if the model is not promoted.
        return None

    zenml_client = Client()
    current_model = get_step_context().model
    model_deployer = zenml_client.active_stack.model_deployer

    # Create the deployment configuration with advanced options.
    vertex_deployment_config = VertexDeploymentConfig(
        location="europe-west1",
        name=current_model.name,  # Unique endpoint name in Vertex AI.
        display_name="zenml-vertex-quickstart",
        model_name=model_registry_uri,  # Fully qualified model name (from the model registry).
        model_version=current_model.version,  # Specify the model version explicitly.
        description="An example of deploying a model using the Vertex AI Model Deployer",
        sync=True,  # Wait for the deployment to complete before proceeding.
        traffic_percentage=100,  # Route 100% of traffic to this model version.
        # (Optional) Advanced configurations:
        # container=VertexAIContainerSpec(
        #     image_uri="your-custom-image:latest",
        #     ports=[8080],
        #     env={"ENV_VAR": "value"},
        # ),
        # resources=VertexAIResourceSpec(
        #     accelerator_type="NVIDIA_TESLA_T4",
        #     accelerator_count=1,
        #     machine_type="n1-standard-4",
        #     min_replica_count=1,
        #     max_replica_count=3,
        # ),
        # explanation=VertexAIExplanationSpec(
        #     metadata={"method": "integrated-gradients"},
        #     parameters={"num_integral_steps": 50},
        # ),
    )

    service = model_deployer.deploy_model(
        config=vertex_deployment_config,
        service_type=VertexDeploymentService.SERVICE_TYPE,
    )

    return service
```

*Example: [`model_deployer.py`](../../examples/vertex-registry-and-deployer/steps/model_deployer.py)*
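
Once the step has run, you can look the deployment up from anywhere that has access to your ZenML stack. This is a minimal sketch, assuming the standard ZenML model deployer interface (`find_model_server`) and that the returned service exposes a `prediction_url`; check the SDK docs for the exact signature in your version:

```python
from zenml.client import Client

# The model deployer component of the active stack
model_deployer = Client().active_stack.model_deployer

# Find running services created by the deployment step above
services = model_deployer.find_model_server(
    pipeline_step_name="model_deployer",
    running=True,
)
if services:
    print(f"Prediction endpoint: {services[0].prediction_url}")
```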

### Configuration Options

The Vertex AI Model Deployer leverages a comprehensive configuration system defined in the shared base configuration and deployer-specific settings:

- **Basic settings:**
  - `location`: The GCP region for deployment (e.g., "us-central1" or "europe-west1").
  - `name`: Unique identifier for the deployed endpoint.
  - `display_name`: A human-friendly name for the endpoint.
  - `model_name`: The fully qualified model name from the model registry.
  - `model_version`: The version of the model to deploy.
  - `description`: A textual description of the deployment.
  - `sync`: Whether the deployment should block until completion.
  - `traffic_percentage`: The percentage of incoming traffic to route to this deployment.

- **Container and resource configuration:**
  - [VertexAIContainerSpec](../../integrations/gcp/flavors/vertex_base_config.py) lets you specify a custom serving container image, HTTP routes (`predict_route`, `health_route`), environment variables, and exposed ports.
  - [VertexAIResourceSpec](../../integrations/gcp/flavors/vertex_base_config.py) lets you override the default machine type, the number of replicas, and GPU options.

- **Advanced settings:**
  - Service account, network configuration, and customer-managed encryption keys.
  - Model explanation settings via `VertexAIExplanationSpec` if you need integrated model interpretability.

These options are defined across the [Vertex AI Base Config](../../integrations/gcp/flavors/vertex_base_config.py) and the deployer-specific configuration in [VertexModelDeployerFlavor](../../integrations/gcp/flavors/vertex_model_deployer_flavor.py).

### Limitations and Considerations

1. **Stack requirements:**
   - It is recommended to pair the deployer with a Vertex AI Model Registry in your stack.
   - Compatible with both local and remote orchestrators.
   - Requires valid GCP credentials and permissions.

2. **Authentication:**
   - Best practice is to use service connectors for secure and managed authentication.
   - Multiple authentication methods are supported (service accounts, local credentials).

3. **Costs:**
   - Vertex AI endpoints incur costs based on machine type and uptime.
   - Use autoscaling (via the configured `min_replica_count` and `max_replica_count`) to manage cost.

4. **Region consistency:**
   - Ensure that the model and the deployment are created in the same GCP region.

For more details, please refer to the [SDK docs](https://sdkdocs.zenml.io) and the relevant implementation files:
- [`vertex_model_deployer.py`](../../integrations/gcp/model_deployers/vertex_model_deployer.py)
- [`vertex_base_config.py`](../../integrations/gcp/flavors/vertex_base_config.py)
- [`vertex_model_deployer_flavor.py`](../../integrations/gcp/flavors/vertex_model_deployer_flavor.py)
# Vertex AI Model Registry

[Vertex AI](https://cloud.google.com/vertex-ai) is Google Cloud's unified ML platform that helps you build, deploy, and scale ML models. The Vertex AI Model Registry is a centralized repository for managing your ML models throughout their lifecycle. With ZenML's Vertex AI Model Registry integration, you can register model versions with extended configuration options, track metadata, and seamlessly deploy your models using Vertex AI's managed infrastructure.

## When would you want to use it?

You should consider using the Vertex AI Model Registry when:

- You're already using Google Cloud Platform (GCP) and want to leverage its native ML infrastructure.
- You need enterprise-grade model management with fine-grained access control.
- You want to track model lineage and metadata in a centralized location.
- You're building ML pipelines that integrate with other Vertex AI services.
- You need to deploy models with custom configurations such as defined container images, resource specifications, and additional metadata.

This registry is particularly useful in scenarios where you:

- Build production ML pipelines that require deployment to Vertex AI endpoints.
- Manage multiple versions of models across development, staging, and production.
- Need to register model versions with detailed configuration for robust deployment.

{% hint style="warning" %}
**Important:** The Vertex AI Model Registry implementation only supports the model **version** interface, not the model interface. This means that you cannot directly register, update, or delete models; you only have operations for model versions. A model container is automatically created with the first version, and subsequent uploads with the same display name create new versions.
{% endhint %}

## How do you deploy it?

The Vertex AI Model Registry flavor is enabled through the ZenML GCP integration. First, install the integration:

```shell
zenml integration install gcp -y
```

### Authentication and Service Connector Configuration

Vertex AI requires proper GCP authentication. The recommended configuration is via a ZenML Service Connector, which supports both service-account-based authentication and local gcloud credentials.

1. **Using a GCP Service Connector with a service account (recommended):**

   ```shell
   # Register the service connector with a service account key
   zenml service-connector register vertex_registry_connector \
       --type gcp \
       --auth-method=service-account \
       --project_id=<PROJECT_ID> \
       [email protected] \
       --resource-type gcp-generic

   # Register the model registry
   zenml model-registry register vertex_registry \
       --flavor=vertex \
       --location=us-central1

   # Connect the model registry to the service connector
   zenml model-registry connect vertex_registry --connector vertex_registry_connector
   ```

2. **Using local gcloud credentials:**

   ```shell
   # Register the model registry using local gcloud auth
   zenml model-registry register vertex_registry \
       --flavor=vertex \
       --location=us-central1
   ```

{% hint style="info" %}
The service account needs the following permissions:
- `Vertex AI User` role for creating and managing model versions.
- `Storage Object Viewer` role if accessing models stored in Google Cloud Storage.
{% endhint %}
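
Before running pipelines, it is worth checking that the connector can actually reach your GCP project and then adding the registry to your active stack. A sketch, assuming `-r` selects the model registry component of the stack:

```shell
# Verify the connector has access to GCP resources
zenml service-connector verify vertex_registry_connector

# Add the registry to the active stack
zenml stack update -r vertex_registry
```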

## How do you use it?

### Registering Models inside a Pipeline with Extended Configuration

The Vertex AI Model Registry supports extended configuration options via the `VertexAIModelConfig` class (defined in [vertex_base_config.py](../../integrations/gcp/flavors/vertex_base_config.py)). This means you can specify additional details for your deployments, such as:

- **Container configuration**: Use `VertexAIContainerSpec` to define a custom serving container (e.g., specifying the `image_uri`, `predict_route`, `health_route`, and exposed ports).
- **Resource configuration**: Use `VertexAIResourceSpec` to specify compute resources like `machine_type`, `min_replica_count`, and `max_replica_count`.
- **Additional metadata and labels**: Annotate your model registrations with pipeline details, stage information, and custom labels.

Below is an example of how you might register a model version in your ZenML pipeline:
```python
from typing_extensions import Annotated

from zenml import ArtifactConfig, get_step_context, step
from zenml.client import Client
from zenml.integrations.gcp.flavors.vertex_base_config import (
    VertexAIContainerSpec,
    VertexAIModelConfig,
    VertexAIResourceSpec,
)
from zenml.logger import get_logger
from zenml.model_registries.base_model_registry import (
    ModelRegistryModelMetadata,
)

logger = get_logger(__name__)


@step(enable_cache=False)
def model_register(
    is_promoted: bool = False,
) -> Annotated[str, ArtifactConfig(name="model_registry_uri")]:
    """Model registration step.

    Registers a model version in the Vertex AI Model Registry with extended
    configuration and returns the full resource name of the registered model.

    The extended configuration includes settings for container, resources,
    and metadata which can then be reused in subsequent model deployments.
    """
    if not is_promoted:
        return ""

    # Get the current model from the step context
    current_model = get_step_context().model

    client = Client()
    model_registry = client.active_stack.model_registry

    # Create an extended model configuration using Vertex AI base settings
    model_config = VertexAIModelConfig(
        location="europe-west1",
        container=VertexAIContainerSpec(
            image_uri="europe-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-5:latest",
            predict_route="predict",
            health_route="health",
            ports=[8080],
        ),
        resources=VertexAIResourceSpec(
            machine_type="n1-standard-4",
            min_replica_count=1,
            max_replica_count=1,
        ),
        labels={"env": "production"},
        description="Extended model configuration for Vertex AI",
    )

    # Register the model version with the extended configuration as metadata
    model_version = model_registry.register_model_version(
        name=current_model.name,
        version=str(current_model.version),
        model_source_uri=current_model.get_model_artifact("sklearn_classifier").uri,
        description="ZenML model version registered with extended configuration",
        metadata=ModelRegistryModelMetadata(
            zenml_pipeline_name=get_step_context().pipeline.name,
            zenml_pipeline_run_uuid=str(get_step_context().pipeline_run.id),
            zenml_step_name=get_step_context().step_run.name,
        ),
        config=model_config,
    )

    logger.info(
        f"Model version {model_version.version} registered in Model Registry"
    )

    # Return the full resource name of the registered model
    return model_version.registered_model.name
```

> **Reviewer comment** (on lines +135 to +146): "This documentation code snippet looks exactly how I imagined the Vertex AI model registry would work: the […]. However, you'll notice that the documentation is incorrect: there is no […]. I think you should make this possible: allow […]."

*Example: [`model_register.py`](../../examples/vertex-registry-and-deployer/steps/model_register.py)*
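
To tie registration and deployment together, the resource name returned by this step can be passed straight into the `model_deployer` step from the Vertex AI Model Deployer docs. A minimal pipeline sketch; the import paths are hypothetical and the promotion flag is hard-coded for illustration:

```python
from zenml import pipeline

# Hypothetical import paths; adjust to where your steps live
from steps.model_deployer import model_deployer
from steps.model_register import model_register


@pipeline
def vertex_register_and_deploy():
    # Register the promoted model version, then deploy it to an endpoint
    model_registry_uri = model_register(is_promoted=True)
    model_deployer(model_registry_uri=model_registry_uri, is_promoted=True)
```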

### Working with Model Versions

Since the Vertex AI Model Registry supports only version-level operations, here are some commands to manage model versions (a Python equivalent follows the commands):

```shell
# List all model versions
zenml model-registry models list-versions <model-name>

# Get details of a specific model version
zenml model-registry models get-version <model-name> -v <version>

# Delete a model version
zenml model-registry models delete-version <model-name> -v <version>
```
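
The same version-level operations are also available from Python via the active stack. A minimal sketch, assuming the standard `BaseModelRegistry` methods (`list_model_versions`, `get_model_version`, `delete_model_version`); `my-model` is a placeholder name:

```python
from zenml.client import Client

# The model registry component of the active stack
model_registry = Client().active_stack.model_registry

# List all versions of a registered model
for version in model_registry.list_model_versions(name="my-model"):
    print(version.version, version.model_source_uri)

# Inspect a specific version, then delete it
model_version = model_registry.get_model_version(name="my-model", version="1")
model_registry.delete_model_version(name="my-model", version="1")
```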

### Configuration Options

The Vertex AI Model Registry accepts several configuration options, including the extended settings:

- **location**: The GCP region where your resources will be created (e.g., "us-central1" or "europe-west1").
- **project_id**: (Optional) A GCP project ID override.
- **credentials**: (Optional) GCP credentials configuration.
- **container**: (Optional) Detailed settings for the model's serving container (defined via `VertexAIContainerSpec`), such as:
  - `image_uri`
  - `predict_route`
  - `health_route`
  - `ports`
- **resources**: (Optional) Compute resource settings (using `VertexAIResourceSpec`) like `machine_type`, `min_replica_count`, and `max_replica_count`.
- **labels** and **metadata**: Additional annotation data for organizing and tracking your model versions.

These configuration options are specified in the [Vertex AI Base Config](../../integrations/gcp/flavors/vertex_base_config.py) and further extended in the [Vertex AI Model Registry Flavor](../../integrations/gcp/flavors/vertex_model_registry_flavor.py).

### Key Differences from Other Model Registries

1. **Version-only interface**: Vertex AI only supports version-level operations for model registration.
2. **Authentication**: Uses GCP service connectors and local credentials integrated via ZenML.
3. **Extended configuration**: Register model versions with detailed container, resource, and metadata settings through `VertexAIModelConfig`.
4. **Managed service**: As a fully managed service, Vertex AI handles the infrastructure while you focus on your ML models.

## Limitations

- The methods `register_model()`, `update_model()`, and `delete_model()` are not implemented; you can only work with model versions.
- It is recommended to specify a serving container image URI rather than rely on the default scikit-learn container, to ensure compatibility with Vertex AI endpoints.
- All models registered through this integration are automatically labeled with `managed_by="zenml"` for consistent tracking, as shown below.
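
Since every model carries that label, a quick way to audit what ZenML has registered is to filter on it with `gcloud`. A sketch (the region and the `--filter` expression are assumptions; adjust to your setup):

```shell
# List Vertex AI models that were registered through ZenML
gcloud ai models list \
    --region=europe-west1 \
    --filter="labels.managed_by=zenml"
```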

For more detailed information, check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration_code_docs/integrations-gcp/#zenml.integrations.gcp.model_registry).