# VertexAI Model-Registry & Model-Deployer (#3161)

base: develop
# Vertex AI Model Deployer

[Vertex AI](https://cloud.google.com/vertex-ai) provides managed infrastructure for deploying machine learning models at scale. The Vertex AI Model Deployer in ZenML allows you to deploy models to Vertex AI endpoints, providing a scalable and fully managed solution for model serving.

## When to use it?

Use the Vertex AI Model Deployer when:

- You are leveraging Google Cloud Platform (GCP) and wish to integrate with its native ML serving infrastructure.
- You need enterprise-grade model serving capabilities complete with autoscaling and GPU acceleration.
- You require a fully managed solution that abstracts away the operational overhead of serving models.
- You need to deploy models directly from your Vertex AI Model Registry, or even from other registries or artifacts.
- You want seamless integration with GCP services like Cloud Logging, IAM, and VPC.

This deployer is especially useful for production deployments, high-availability serving, and dynamic scaling based on workloads.

{% hint style="info" %}
For best results, the Vertex AI Model Deployer works with a Vertex AI Model Registry in your ZenML stack. This allows you to register models with detailed metadata and configuration and then deploy a specific version seamlessly.
{% endhint %}

## How to deploy it?

The Vertex AI Model Deployer is enabled via the ZenML GCP integration. First, install the integration:

```shell
zenml integration install gcp -y
```

### Authentication and Service Connector Configuration

The deployer requires proper GCP authentication. The recommended approach is to use a ZenML Service Connector:

```shell
# Register the service connector with a service account key
zenml service-connector register vertex_deployer_connector \
    --type gcp \
    --auth-method=service-account \
    --project_id=<PROJECT_ID> \
    [email protected] \
    --resource-type gcp-generic

# Register the model deployer and connect it to the service connector
zenml model-deployer register vertex_deployer \
    --flavor=vertex \
    --location=us-central1 \
    --connector vertex_deployer_connector
```
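
With the deployer registered, add it to the stack your pipelines run on. A minimal sketch, assuming an active stack is already set (`-d` selects the model deployer component):

```shell
# Add the Vertex AI deployer to the active stack
zenml stack update -d vertex_deployer
```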

{% hint style="info" %}
The service account used for deployment must have the following permissions:
- `Vertex AI User` to enable model deployments
- `Vertex AI Service Agent` for model endpoint management
- `Storage Object Viewer` if the model artifacts reside in Google Cloud Storage
{% endhint %}
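
If you manage IAM yourself, the corresponding grants can be applied with `gcloud` along the lines below. This is a sketch: the service-account address is a placeholder, and the role IDs are assumed to map to `roles/aiplatform.user` and `roles/storage.objectViewer`:

```shell
# Grant the Vertex AI User role to the deployment service account
gcloud projects add-iam-policy-binding <PROJECT_ID> \
    --member="serviceAccount:<SA_NAME>@<PROJECT_ID>.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"

# Grant read access to model artifacts stored in GCS
gcloud projects add-iam-policy-binding <PROJECT_ID> \
    --member="serviceAccount:<SA_NAME>@<PROJECT_ID>.iam.gserviceaccount.com" \
    --role="roles/storage.objectViewer"
```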

## How to use it

A complete usage example is available in the [ZenML Examples repository](https://github.com/zenml-io/zenml-projects/tree/main/vertex-registry-and-deployer).

### Deploying a Model in a Pipeline

Below is an example of a deployment step that uses the updated configuration options. In this example, the deployment configuration supports:

- **Model versioning**: Explicitly provide the model version (using the full resource name from the model registry).
- **Display name and sync mode**: Fields such as `display_name` (for a friendly endpoint name) and `sync` (to wait for deployment completion) are available.
- **Traffic configuration**: Route a certain percentage (e.g., 100%) of traffic to this deployment.
- **Advanced options**: You can still specify custom container settings, resource specifications (including GPU options), and explanation configuration via shared classes from `vertex_base_config.py`.

```python
from typing import Optional

from typing_extensions import Annotated

from zenml import ArtifactConfig, get_step_context, step
from zenml.client import Client
from zenml.integrations.gcp.services.vertex_deployment import (
    VertexDeploymentConfig,
    VertexDeploymentService,
)


@step(enable_cache=False)
def model_deployer(
    model_registry_uri: str,
    is_promoted: bool = False,
) -> Annotated[
    Optional[VertexDeploymentService],
    ArtifactConfig(name="vertex_deployment", is_deployment_artifact=True),
]:
    """Model deployer step.

    Args:
        model_registry_uri: The full resource name of the model in the registry.
        is_promoted: Flag indicating if the model is promoted to production.

    Returns:
        The deployed model service, or None if the model was not promoted.
    """
    if not is_promoted:
        # Skip deployment if the model is not promoted.
        return None

    zenml_client = Client()
    current_model = get_step_context().model
    model_deployer = zenml_client.active_stack.model_deployer

    # Create the deployment configuration with advanced options.
    vertex_deployment_config = VertexDeploymentConfig(
        location="europe-west1",
        name=current_model.name,  # Unique endpoint name in Vertex AI.
        display_name="zenml-vertex-quickstart",
        model_name=model_registry_uri,  # Fully qualified model name (from the model registry).
        model_version=current_model.version,  # Specify the model version explicitly.
        description="An example of deploying a model using the Vertex AI Model Deployer",
        sync=True,  # Wait for the deployment to complete before proceeding.
        traffic_percentage=100,  # Route 100% of traffic to this model version.
        # (Optional) Advanced configurations:
        # container=VertexAIContainerSpec(
        #     image_uri="your-custom-image:latest",
        #     ports=[8080],
        #     env={"ENV_VAR": "value"},
        # ),
        # resources=VertexAIResourceSpec(
        #     accelerator_type="NVIDIA_TESLA_T4",
        #     accelerator_count=1,
        #     machine_type="n1-standard-4",
        #     min_replica_count=1,
        #     max_replica_count=3,
        # ),
        # explanation=VertexAIExplanationSpec(
        #     metadata={"method": "integrated-gradients"},
        #     parameters={"num_integral_steps": 50},
        # ),
    )

    service = model_deployer.deploy_model(
        config=vertex_deployment_config,
        service_type=VertexDeploymentService.SERVICE_TYPE,
    )

    return service
```

*Example: [`model_deployer.py`](../../examples/vertex-registry-and-deployer/steps/model_deployer.py)*
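
Once the step has run, you can look the deployment up from anywhere that has access to your ZenML stack. This is a minimal sketch, assuming the standard ZenML model deployer interface (`find_model_server`) and that the returned service exposes a `prediction_url`; check the SDK docs for the exact signature in your version:

```python
from zenml.client import Client

# The model deployer component of the active stack
model_deployer = Client().active_stack.model_deployer

# Find running services created by the deployment step above
services = model_deployer.find_model_server(
    pipeline_step_name="model_deployer",
    running=True,
)
if services:
    print(f"Prediction endpoint: {services[0].prediction_url}")
```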

### Configuration Options

The Vertex AI Model Deployer leverages a comprehensive configuration system defined in the shared base configuration and deployer-specific settings:

- **Basic settings:**
  - `location`: The GCP region for deployment (e.g., "us-central1" or "europe-west1").
  - `name`: Unique identifier for the deployed endpoint.
  - `display_name`: A human-friendly name for the endpoint.
  - `model_name`: The fully qualified model name from the model registry.
  - `model_version`: The version of the model to deploy.
  - `description`: A textual description of the deployment.
  - `sync`: Whether the deployment should block until completion.
  - `traffic_percentage`: The percentage of incoming traffic to route to this deployment.

- **Container and resource configuration:**
  - [VertexAIContainerSpec](../../integrations/gcp/flavors/vertex_base_config.py) lets you specify a custom serving container image, HTTP routes (`predict_route`, `health_route`), environment variables, and exposed ports.
  - [VertexAIResourceSpec](../../integrations/gcp/flavors/vertex_base_config.py) lets you override the default machine type, the number of replicas, and GPU options.

- **Advanced settings:**
  - Service account, network configuration, and customer-managed encryption keys.
  - Model explanation settings via `VertexAIExplanationSpec` if you need integrated model interpretability.

These options are defined across the [Vertex AI Base Config](../../integrations/gcp/flavors/vertex_base_config.py) and the deployer-specific configuration in [VertexModelDeployerFlavor](../../integrations/gcp/flavors/vertex_model_deployer_flavor.py).

### Limitations and Considerations

1. **Stack requirements:**
   - It is recommended to pair the deployer with a Vertex AI Model Registry in your stack.
   - Compatible with both local and remote orchestrators.
   - Requires valid GCP credentials and permissions.

2. **Authentication:**
   - Best practice is to use service connectors for secure and managed authentication.
   - Multiple authentication methods are supported (service accounts, local credentials).

3. **Costs:**
   - Vertex AI endpoints incur costs based on machine type and uptime.
   - Use autoscaling (via the configured `min_replica_count` and `max_replica_count`) to manage cost.

4. **Region consistency:**
   - Ensure that the model and the deployment are created in the same GCP region.

For more details, please refer to the [SDK docs](https://sdkdocs.zenml.io) and the relevant implementation files:
- [`vertex_model_deployer.py`](../../integrations/gcp/model_deployers/vertex_model_deployer.py)
- [`vertex_base_config.py`](../../integrations/gcp/flavors/vertex_base_config.py)
- [`vertex_model_deployer_flavor.py`](../../integrations/gcp/flavors/vertex_model_deployer_flavor.py)
# Vertex AI Model Registry

[Vertex AI](https://cloud.google.com/vertex-ai) is Google Cloud's unified ML platform that helps you build, deploy, and scale ML models. The Vertex AI Model Registry is a centralized repository for managing your ML models throughout their lifecycle. With ZenML's Vertex AI Model Registry integration, you can register model versions with extended configuration options, track metadata, and seamlessly deploy your models using Vertex AI's managed infrastructure.

## When would you want to use it?

You should consider using the Vertex AI Model Registry when:

- You're already using Google Cloud Platform (GCP) and want to leverage its native ML infrastructure.
- You need enterprise-grade model management with fine-grained access control.
- You want to track model lineage and metadata in a centralized location.
- You're building ML pipelines that integrate with other Vertex AI services.
- You need to deploy models with custom configurations such as defined container images, resource specifications, and additional metadata.

This registry is particularly useful in scenarios where you:

- Build production ML pipelines that require deployment to Vertex AI endpoints.
- Manage multiple versions of models across development, staging, and production.
- Need to register model versions with detailed configuration for robust deployment.

{% hint style="warning" %}
**Important:** The Vertex AI Model Registry implementation only supports the model **version** interface, not the model interface. This means that you cannot directly register, update, or delete models; you only have operations for model versions. A model container is automatically created with the first version, and subsequent uploads with the same display name create new versions.
{% endhint %}

## How do you deploy it?

The Vertex AI Model Registry flavor is enabled through the ZenML GCP integration. First, install the integration:

```shell
zenml integration install gcp -y
```

### Authentication and Service Connector Configuration

Vertex AI requires proper GCP authentication. The recommended configuration is via a ZenML Service Connector, which supports both service-account-based authentication and local gcloud credentials.

1. **Using a GCP Service Connector with a service account (recommended):**

   ```shell
   # Register the service connector with a service account key
   zenml service-connector register vertex_registry_connector \
       --type gcp \
       --auth-method=service-account \
       --project_id=<PROJECT_ID> \
       [email protected] \
       --resource-type gcp-generic

   # Register the model registry
   zenml model-registry register vertex_registry \
       --flavor=vertex \
       --location=us-central1

   # Connect the model registry to the service connector
   zenml model-registry connect vertex_registry --connector vertex_registry_connector
   ```

2. **Using local gcloud credentials:**

   ```shell
   # Register the model registry using local gcloud auth
   zenml model-registry register vertex_registry \
       --flavor=vertex \
       --location=us-central1
   ```

{% hint style="info" %}
The service account needs the following permissions:
- `Vertex AI User` role for creating and managing model versions.
- `Storage Object Viewer` role if accessing models stored in Google Cloud Storage.
{% endhint %}
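
Before running pipelines, it is worth checking that the connector can actually reach your GCP project and then adding the registry to your active stack. A sketch, assuming `-r` selects the model registry component of the stack:

```shell
# Verify the connector has access to GCP resources
zenml service-connector verify vertex_registry_connector

# Add the registry to the active stack
zenml stack update -r vertex_registry
```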

## How do you use it?

### Registering Models inside a Pipeline with Extended Configuration

The Vertex AI Model Registry supports extended configuration options via the `VertexAIModelConfig` class (defined in [vertex_base_config.py](../../integrations/gcp/flavors/vertex_base_config.py)). This means you can specify additional details for your deployments, such as:

- **Container configuration**: Use `VertexAIContainerSpec` to define a custom serving container (e.g., specifying the `image_uri`, `predict_route`, `health_route`, and exposed ports).
- **Resource configuration**: Use `VertexAIResourceSpec` to specify compute resources like `machine_type`, `min_replica_count`, and `max_replica_count`.
- **Additional metadata and labels**: Annotate your model registrations with pipeline details, stage information, and custom labels.

Below is an example of how you might register a model version in your ZenML pipeline:
```python
from typing_extensions import Annotated

from zenml import ArtifactConfig, get_step_context, step
from zenml.client import Client
from zenml.integrations.gcp.flavors.vertex_base_config import (
    VertexAIContainerSpec,
    VertexAIModelConfig,
    VertexAIResourceSpec,
)
from zenml.logger import get_logger
from zenml.model_registries.base_model_registry import (
    ModelRegistryModelMetadata,
)

logger = get_logger(__name__)


@step(enable_cache=False)
def model_register(
    is_promoted: bool = False,
) -> Annotated[str, ArtifactConfig(name="model_registry_uri")]:
    """Model registration step.

    Registers a model version in the Vertex AI Model Registry with extended
    configuration and returns the full resource name of the registered model.

    The extended configuration includes settings for container, resources,
    and metadata which can then be reused in subsequent model deployments.
    """
    if not is_promoted:
        return ""

    # Get the current model from the step context
    current_model = get_step_context().model

    client = Client()
    model_registry = client.active_stack.model_registry

    # Create an extended model configuration using Vertex AI base settings
    model_config = VertexAIModelConfig(
        location="europe-west1",
        container=VertexAIContainerSpec(
            image_uri="europe-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-5:latest",
            predict_route="predict",
            health_route="health",
            ports=[8080],
        ),
        resources=VertexAIResourceSpec(
            machine_type="n1-standard-4",
            min_replica_count=1,
            max_replica_count=1,
        ),
        labels={"env": "production"},
        description="Extended model configuration for Vertex AI",
    )

    # Register the model version with the extended configuration as metadata
    model_version = model_registry.register_model_version(
        name=current_model.name,
        version=str(current_model.version),
        model_source_uri=current_model.get_model_artifact("sklearn_classifier").uri,
        description="ZenML model version registered with extended configuration",
        metadata=ModelRegistryModelMetadata(
            zenml_pipeline_name=get_step_context().pipeline.name,
            zenml_pipeline_run_uuid=str(get_step_context().pipeline_run.id),
            zenml_step_name=get_step_context().step_run.name,
        ),
        config=model_config,
    )

    logger.info(
        f"Model version {model_version.version} registered in Model Registry"
    )

    # Return the full resource name of the registered model
    return model_version.registered_model.name
```

> **Reviewer comment** (on lines +135 to +146): "This documentation code snippet looks exactly how I imagined the Vertex AI model registry would work: the […]. However, you'll notice that the documentation is incorrect: there is no […]. I think you should make this possible: allow […]."

*Example: [`model_register.py`](../../examples/vertex-registry-and-deployer/steps/model_register.py)*
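
To tie registration and deployment together, the resource name returned by this step can be passed straight into the `model_deployer` step from the Vertex AI Model Deployer docs. A minimal pipeline sketch; the import paths are hypothetical and the promotion flag is hard-coded for illustration:

```python
from zenml import pipeline

# Hypothetical import paths; adjust to where your steps live
from steps.model_deployer import model_deployer
from steps.model_register import model_register


@pipeline
def vertex_register_and_deploy():
    # Register the promoted model version, then deploy it to an endpoint
    model_registry_uri = model_register(is_promoted=True)
    model_deployer(model_registry_uri=model_registry_uri, is_promoted=True)
```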

### Working with Model Versions

Since the Vertex AI Model Registry supports only version-level operations, here are some commands to manage model versions (a Python equivalent follows the commands):

```shell
# List all model versions
zenml model-registry models list-versions <model-name>

# Get details of a specific model version
zenml model-registry models get-version <model-name> -v <version>

# Delete a model version
zenml model-registry models delete-version <model-name> -v <version>
```
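
The same version-level operations are also available from Python via the active stack. A minimal sketch, assuming the standard `BaseModelRegistry` methods (`list_model_versions`, `get_model_version`, `delete_model_version`); `my-model` is a placeholder name:

```python
from zenml.client import Client

# The model registry component of the active stack
model_registry = Client().active_stack.model_registry

# List all versions of a registered model
for version in model_registry.list_model_versions(name="my-model"):
    print(version.version, version.model_source_uri)

# Inspect a specific version, then delete it
model_version = model_registry.get_model_version(name="my-model", version="1")
model_registry.delete_model_version(name="my-model", version="1")
```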

### Configuration Options

The Vertex AI Model Registry accepts several configuration options, including the extended settings:

- **location**: The GCP region where your resources will be created (e.g., "us-central1" or "europe-west1").
- **project_id**: (Optional) A GCP project ID override.
- **credentials**: (Optional) GCP credentials configuration.
- **container**: (Optional) Detailed settings for the model's serving container (defined via `VertexAIContainerSpec`), such as:
  - `image_uri`
  - `predict_route`
  - `health_route`
  - `ports`
- **resources**: (Optional) Compute resource settings (using `VertexAIResourceSpec`) like `machine_type`, `min_replica_count`, and `max_replica_count`.
- **labels** and **metadata**: Additional annotation data for organizing and tracking your model versions.

These configuration options are specified in the [Vertex AI Base Config](../../integrations/gcp/flavors/vertex_base_config.py) and further extended in the [Vertex AI Model Registry Flavor](../../integrations/gcp/flavors/vertex_model_registry_flavor.py).

### Key Differences from Other Model Registries

1. **Version-only interface**: Vertex AI only supports version-level operations for model registration.
2. **Authentication**: Uses GCP service connectors and local credentials integrated via ZenML.
3. **Extended configuration**: Register model versions with detailed container, resource, and metadata settings through `VertexAIModelConfig`.
4. **Managed service**: As a fully managed service, Vertex AI handles the infrastructure while you focus on your ML models.

## Limitations

- The methods `register_model()`, `update_model()`, and `delete_model()` are not implemented; you can only work with model versions.
- It is recommended to specify a serving container image URI rather than rely on the default scikit-learn container, to ensure compatibility with Vertex AI endpoints.
- All models registered through this integration are automatically labeled with `managed_by="zenml"` for consistent tracking, as shown below.
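
Since every model carries that label, a quick way to audit what ZenML has registered is to filter on it with `gcloud`. A sketch (the region and the `--filter` expression are assumptions; adjust to your setup):

```shell
# List Vertex AI models that were registered through ZenML
gcloud ai models list \
    --region=europe-west1 \
    --filter="labels.managed_by=zenml"
```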

For more detailed information, check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration_code_docs/integrations-gcp/#zenml.integrations.gcp.model_registry).