getML on Vertex AI
Overview¶
This tutorial demonstrates how to use the Vertex AI SDK and the gcloud CLI to build and deploy custom containers for training and prediction of getML models.
Dataset¶
This tutorial uses the financial dataset from the CTU Prague Relational Learning Repository. It consists of multiple tables containing various features related to bank customers and their transaction histories. The target variable is whether a customer defaults on a loan.
Note: This notebook is based on Predicting the loan default risk of Czech bank customers using getML. Check it out first if you want to know more about the dataset and getML in general.
Objective¶
The goal of this tutorial is to:
- Train a getML model using relational data from multiple tables.
- Save the trained model and its serialized pre-processor.
- Build a custom getML serving container with custom prediction logic using the Custom Prediction Routine feature in the Vertex AI SDK.
- Test the built container locally.
- Upload and deploy the custom container to Vertex AI Predictions.
Note: This tutorial focuses more on deploying getML models with Vertex AI than on the design of the model itself.
Costs¶
This tutorial involves the use of billable components of Google Cloud:
- Vertex AI
- Google Cloud Storage
- Google Container Registry
TIP: Check out Vertex AI pricing, and use the Pricing Calculator to generate a cost estimate based on your projected usage.
Before you begin¶
Note: If you are running this notebook on Vertex AI Workbench, your environment already meets most requirements. However, you need to add the storage.admin role to the Compute Engine default service account (*-compute@developer.gserviceaccount.com) that is assigned to this notebook by default. See the step VertexAI Workbench: Adding Role to Service Account below.
If you run this notebook locally, please consider the following requirements:
Set up Your Local Development Environment¶
If you run this notebook on your local machine, make sure your environment meets this notebook's requirements:
Note: If you need to install Docker or the SDK, the links will guide you to the installation steps.
Set up Your Google Cloud Project¶
The following steps are required, regardless of your notebook environment.
IMPORTANT! If you have not used gcloud CLI before you need to set it up first. On your local shell, run:
gcloud init
During the process you will authenticate, get credentials and can set your default project / region.
Note: All commands prefixed with ! are shell commands. The ! prefix allows them to be executed directly within Jupyter. However, you can also execute them in a dedicated Terminal.
Determine Environment¶
We need to adapt to the environment this notebook runs in: if this notebook runs on Vertex AI Workbench, IS_WORKBENCH_ENV is True.
import os
IS_WORKBENCH_ENV = "GOOGLE_VM_CONFIG_LOCK_FILE" in os.environ
Install requirements¶
getml.vertexai is located within src. This package contains:
- Utility functions for accessing GCP resources
- Configurations for this notebook and the training/inference containers we will create later
- Dependencies needed for the notebook and Docker containers:
  - getml==1.4.0
  - google-cloud-aiplatform[prediction]==1.56.0
  - pyyaml==6.0.1
The Python Cloud Client Library google-cloud-aiplatform is needed to interact with services from Google Cloud, including
- Vertex AI
- Cloud Storage
The [prediction] extra includes FastAPI, which is needed for building the prediction container later on.
For more information on getML, check out the documentation.
Install getml.vertexai¶
In the Vertex AI Workbench environment, perform the following steps:
- Download the tarball version of the getml-demo repository.
- Extract the content of the project folder into the current working directory.
# type: ignore
if IS_WORKBENCH_ENV:
# strip-components=1 is necessary to avoid creating a directory with the name of the repository
! curl -L https://api.github.com/repos/getml/getml-demo/tarball/vertexai | tar --strip-components=1 -xz
! uv pip install --force-reinstall "."
Kernel restart¶
On Workbench we also need to restart the kernel to apply all changes.
# type: ignore
if IS_WORKBENCH_ENV:
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)
Redefine IS_WORKBENCH_ENV after the kernel restart:
import os
IS_WORKBENCH_ENV = "GOOGLE_VM_CONFIG_LOCK_FILE" in os.environ
Import Vertex AI SDK¶
aiplatform
is part of the google-cloud-aiplatform
package. It provides a Python API for interacting with Vertex AI services.
from google.cloud import aiplatform
Configuration¶
Define and save configuration variables using Config
, storing them in config.yaml
.
This method centralizes project variable definitions within the notebook and ensures availability to Docker containers created later.
Access the configuration via the cfg
instance using dot notation, e.g., cfg.REGION
.
Note: If you are using Vertex AI Workbench, this notebook is associated with the Compute Engine default service account. SERVICE_ACCOUNT_NAME will be automatically filled with the name of the corresponding *-compute@developer.gserviceaccount.com account.
from getml.vertexai.config import Config
cfg = Config(
{
"GCP_PROJECT_NAME": "", # NOTE: Must be globally(!) unique on GCP
"BUCKET_NAME": "", # NOTE: Must be globally(!) unique on GCP
"BUCKET_DIR_MODEL": "model_artifact",
"BUCKET_DIR_DATASET": "datasets",
"REGION": "europe-west1", # NOTE: Adapt to your preferred region
"SERVICE_ACCOUNT_NAME": "getml-vertexai-sa", # NOTE: Gets replaced, if you run on Vertex AI Workbench
"DOCKER_REPOSITORY": "getml-vertexai-docker-repository",
"GETML_PROJECT_NAME": "Loans",
}
)
# Save configuration for later use in Docker containers
cfg.save("config.yaml")
Print all available configurations
cfg
Set the project and region to ensure ! gcloud
commands are executed accordingly.
! gcloud config set project {cfg.GCP_PROJECT_NAME}
! gcloud config set ai/region {cfg.REGION}
Initialize the Vertex AI SDK and set the project and location defaults there as well. This ensures all aiplatform
related commands/functions execute on the correct project and region.
aiplatform.init(project=cfg.GCP_PROJECT_NAME, location=cfg.REGION)
Enable Necessary APIs¶
If you have just created a new project, some APIs might not be enabled yet. Use the following command to enable all the APIs needed for this tutorial:
! gcloud services enable \
iam.googleapis.com \
compute.googleapis.com \
containerregistry.googleapis.com \
aiplatform.googleapis.com
Setup Service Account¶
We need a service account to provide our containers with the appropriate permissions to access:
- Storage Buckets (Save and load model artifacts)
- MetadataStore (Logging metrics/Experiments)
VertexAI Workbench: Adding Role to Service Account¶
If you are running this notebook on Vertex AI Workbench, it is associated with a Service Account (see SERVICE_ACCOUNT_EMAIL
). To ensure proper functionality, you need to add the storage.admin
role to this account.
Perform this step on the GCP Platform. Follow the link below (it should automatically open) and add the storage.admin
role to the Service Account associated with this notebook.
from getml.vertexai import open_iam_permissions
if IS_WORKBENCH_ENV:
open_iam_permissions(cfg.GCP_PROJECT_NAME)
cfg.print(["SERVICE_ACCOUNT_EMAIL"])
cfg.print_links(["iam_permissions"])
Local Environment: Create a Service Account¶
If this notebook runs in a local environment and you are authenticated to the gcloud CLI with your personal account, we need to create a service account.
NOTE: If you run this notebook on VertexAI Workbench skip this step and continue with Save Service Account to JSON
# NOTE: If the service account already exists in the project, the following error can be ignored:
# ERROR: (gcloud.iam.service-accounts.create) Resource in projects [$PROJECT_ID] is the subject of a conflict..
if not IS_WORKBENCH_ENV:
cfg.print(["SERVICE_ACCOUNT_NAME"])
! gcloud iam service-accounts create {cfg.SERVICE_ACCOUNT_NAME} \
--display-name="getML Vertex AI Service Account"
Set Permissions on Service Account¶
Once the service account is created, we need to grant the roles aiplatform.user
and storage.admin
to it:
if not IS_WORKBENCH_ENV:
cfg.print(["GCP_PROJECT_NAME", "SERVICE_ACCOUNT_EMAIL"])
# Assign the Vertex AI User role
! gcloud projects add-iam-policy-binding {cfg.GCP_PROJECT_NAME} \
--member="serviceAccount:{cfg.SERVICE_ACCOUNT_EMAIL}" \
--role="roles/aiplatform.user"
# Assign the Storage Admin role
! gcloud projects add-iam-policy-binding {cfg.GCP_PROJECT_NAME} \
--member="serviceAccount:{cfg.SERVICE_ACCOUNT_EMAIL}" \
--role="roles/storage.admin"
Save Service Account to JSON¶
We will need the service_account.json
file later when we create a local endpoint to test our container.
NOTE: If too many keys have been created, the following error can occur:
ERROR: (gcloud.iam.service-accounts.keys.create) FAILED_PRECONDITION: Precondition check failed.
In this case, older keys should be deleted before creating a new one.
To prevent this from happening in the first place, we check whether a service_account.json is already present before creating a new key.
# type: ignore
from pathlib import Path
cfg.print(["SERVICE_ACCOUNT_EMAIL"])
PATH_SERVICE_ACCOUNT_CREDENTIALS = Path("service_account.json")
if not PATH_SERVICE_ACCOUNT_CREDENTIALS.exists():
! gcloud iam service-accounts keys create {PATH_SERVICE_ACCOUNT_CREDENTIALS.name} \
--iam-account={cfg.SERVICE_ACCOUNT_EMAIL}
Create Cloud Storage Bucket¶
The bucket will serve as cloud storage for:
- Trained model artifacts (The result of the training container)
- Datasets (Loans dataset)
Both are included in the getML project dump, Loans.getml
, which will be stored in the bucket we create now:
# NOTE: If BUCKET_URI already exists, the following error can be ignored:
# "ServiceException 409 A Cloud Storage bucket named $BUCKET_NAME already exists."
# Create the bucket
! gsutil mb -l {cfg.REGION} -p {cfg.GCP_PROJECT_NAME} {cfg.BUCKET_URI}
cfg.print(["BUCKET_URI"])
cfg.print_links(["bucket"])
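If you prefer to stay in Python, the bucket can also be created with the google-cloud-storage client. The following is only a hedged alternative sketch to the gsutil command above; it assumes the google-cloud-storage package is available in your environment.
from google.cloud import storage
# Sketch: create the bucket with the Python client instead of gsutil
# (the gsutil command above is the approach used in this tutorial).
storage_client = storage.Client(project=cfg.GCP_PROJECT_NAME)
if storage_client.lookup_bucket(cfg.BUCKET_NAME) is None:
    storage_client.create_bucket(cfg.BUCKET_NAME, location=cfg.REGION)
    print(f"Created bucket {cfg.BUCKET_NAME}")
else:
    print(f"Bucket {cfg.BUCKET_NAME} already exists")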
Create Docker Repository on Artifact Registry¶
The Docker repository on Google Cloud's Artifact Registry
will store the Docker images required for our training and prediction containers. These images will be built locally and then pushed to this repository for deployment on Vertex AI.
# NOTE: If DOCKER_REPOSITORY already exists, the following error can be ignored:
# ERROR: (gcloud.artifacts.repositories.create) ALREADY_EXISTS: the repository already exists
! gcloud artifacts repositories create {cfg.DOCKER_REPOSITORY} \
--repository-format=docker \
--location={cfg.REGION} \
--description="Docker repository for getML Vertex AI Images"
cfg.print(["DOCKER_REPOSITORY", "REGION"])
cfg.print_links(["docker_repository"])
Configure Docker¶
To be able to upload images to the repository, you need to update your Docker settings:
! gcloud auth configure-docker --quiet
! gcloud auth configure-docker --quiet {cfg.REGION}-docker.pkg.dev
Set the DOCKER_HOST environment variable to the current Docker daemon path.
This is necessary for compatibility of rootless Docker setups with the Vertex AI SDK.
from getml.vertexai.utils import get_docker_daemon_path
os.environ["DOCKER_HOST"] = get_docker_daemon_path()
Handling "line buffering" Warnings¶
In this notebook, you may see warnings related to line buffering when using the subprocess module. These warnings do not impact the accuracy or performance and cannot be resolved within this notebook's context. Therefore, we will ignore them to keep our output clean.
Note: You might still see line buffering warnings when running ! gcloud
commands. As stated, these can be safely ignored.
import warnings
warnings.filterwarnings("ignore", message="line buffering")
Setup Finished¶
We have completed all setup and configuration steps and are now ready to start training our model.
Training¶
This notebook demonstrates the training of a binary classification model. It is based on the Loans notebook. Check out the link for more details on the dataset and usage of the getML Python API.
Main Objectives¶
The main objectives of the training container are to:
- Get and preprocess the Loans dataset.
- Train a getML model (pipeline) on the training set.
- Score the trained model on the test set.
- Save the project (including data and model) as an artifact on the GCS bucket (a minimal sketch of this step follows below).
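To make the last step more concrete, here is a minimal, hypothetical sketch of how the project bundle could be copied to the bucket with the google-cloud-storage client. The actual upload logic lives in training/train.py and may differ; "Loans.getml" is assumed to be the project bundle written during training.
from google.cloud import storage
# Hypothetical sketch of the artifact upload step (train.py may handle this differently).
client = storage.Client(project=cfg.GCP_PROJECT_NAME)
blob = client.bucket(cfg.BUCKET_NAME).blob(f"{cfg.BUCKET_DIR_MODEL}/Loans.getml")
blob.upload_from_filename("Loans.getml")
print(f"Uploaded project bundle to gs://{cfg.BUCKET_NAME}/{blob.name}")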
Create Managed Dataset¶
To use experiments, a managed dataset is essential as it creates a default MetadataStore. The Experiments/MetadataStore is crucial for logging and tracking experiments, ensuring all data-related activities are properly recorded and managed within the Vertex AI ecosystem.
The managed dataset created here is primarily for demonstration purposes and to establish a MetadataStore. The actual data used to train our model is retrieved within the training Docker container. For details, see training/train.py
.
from getml.vertexai import create_vertex_dataset_tabular
dataset_loans = create_vertex_dataset_tabular(
cfg=cfg, filename_csv="datasets/loans_population_test.csv"
)
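For illustration only, the following is a rough sketch of what such a helper might do under the hood: upload the CSV to the bucket and register it as a managed tabular dataset. The actual implementation of create_vertex_dataset_tabular may differ.
from pathlib import Path
from google.cloud import aiplatform, storage
# Rough sketch only; the real helper in getml.vertexai may differ.
def create_tabular_dataset_sketch(cfg, filename_csv: str):
    # 1. Upload the local CSV to the dataset directory in the bucket.
    client = storage.Client(project=cfg.GCP_PROJECT_NAME)
    blob = client.bucket(cfg.BUCKET_NAME).blob(
        f"{cfg.BUCKET_DIR_DATASET}/{Path(filename_csv).name}"
    )
    blob.upload_from_filename(filename_csv)
    # 2. Register the uploaded CSV as a Vertex AI managed tabular dataset.
    return aiplatform.TabularDataset.create(
        display_name=f"{cfg.GETML_PROJECT_NAME}-population-test",
        gcs_source=[f"gs://{cfg.BUCKET_NAME}/{blob.name}"],
    )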
Build Docker Container for Training¶
For training we just need a simple Docker container that includes:
- A Python runtime (we conveniently use a public Python image as the base layer)
- Python dependencies:
  - getml
  - getml-playbooks
  - google-cloud-aiplatform
print("Content of Dockerfile.train:\n")
%cat training/Dockerfile.train
Now let's
- build the Dockerfile.train image, and
- push it to the Artifact Registry.
cfg.print(["DOCKER_IMAGE_URI_TRAIN"])
! docker build -f training/Dockerfile.train -t {cfg.DOCKER_IMAGE_URI_TRAIN} .
! docker push {cfg.DOCKER_IMAGE_URI_TRAIN}
Deploy Training Job¶
The gcloud ai custom-jobs create command
- wraps the train.py script into our training Docker container, then
- runs it in the Vertex AI environment on Google Cloud.
- Finally, the result is an artifact containing the getML model and data frames.
For more details about the command, check out https://cloud.google.com/sdk/gcloud/reference/ai/custom-jobs/create
cfg.print(
[
"GETML_PROJECT_NAME",
"GCP_PROJECT_NAME",
"REGION",
"SERVICE_ACCOUNT_EMAIL",
"DOCKER_IMAGE_URI_TRAIN",
"BUCKET_URI_DATASET",
]
)
# Define variables for the training job
TRAIN_DISPLAY_NAME = f"getml-train-{cfg.GETML_PROJECT_NAME}"
TRAIN_LOCAL_PACKAGE_PATH = "training"
TRAIN_SCRIPT = "train.py"
TRAIN_MACHINE_TYPE = "n1-standard-4"
TRAIN_REPLICA_COUNT = 1
# Create and run the custom training job
! gcloud ai custom-jobs create \
--project={cfg.GCP_PROJECT_NAME} \
--region={cfg.REGION} \
--display-name={TRAIN_DISPLAY_NAME} \
--service-account={cfg.SERVICE_ACCOUNT_EMAIL} \
--worker-pool-spec=machine-type={TRAIN_MACHINE_TYPE},replica-count={TRAIN_REPLICA_COUNT},executor-image-uri={cfg.DOCKER_IMAGE_URI_TRAIN},local-package-path={TRAIN_LOCAL_PACKAGE_PATH},script={TRAIN_SCRIPT}
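If you prefer the Python SDK over gcloud, a roughly equivalent (hedged) sketch uses aiplatform.CustomJob.from_local_script, which similarly autopackages the local training code on top of the executor image. The exact behaviour may differ slightly from the gcloud command above.
# Hedged sketch of a Python SDK alternative to the gcloud command above.
job = aiplatform.CustomJob.from_local_script(
    display_name=TRAIN_DISPLAY_NAME,
    script_path=f"{TRAIN_LOCAL_PACKAGE_PATH}/{TRAIN_SCRIPT}",
    container_uri=cfg.DOCKER_IMAGE_URI_TRAIN,
    machine_type=TRAIN_MACHINE_TYPE,
    replica_count=TRAIN_REPLICA_COUNT,
    staging_bucket=cfg.BUCKET_URI,
)
# job.run(service_account=cfg.SERVICE_ACCOUNT_EMAIL)  # uncomment to submit via the SDK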
Result of Training Container¶
The following links contain the resources we just created, as well as the resulting artifact from the training container:
cfg.print_links(["training_jobs", "model_artifact", "experiments"])
Prediction / Inference¶
Now that we have a trained model artifact stored on GCS, we can
- build a prediction routine that loads the artifact, and
- deploy an HTTP endpoint to run predictions on our model.
Details of the prediction container¶
Basically, the container provides the HTTP route predict via FastAPI, served by Uvicorn and Gunicorn.
To know more about the Predictor
class, see the documentation on custom prediction routines.
You can find all relevant files within the prediction folder.
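To give an idea of what such a predictor looks like, here is a minimal, hypothetical skeleton of a Custom Prediction Routine Predictor. It only illustrates the interface; the actual predictor in the prediction folder implements the getML-specific logic.
from google.cloud.aiplatform.prediction.predictor import Predictor
from google.cloud.aiplatform.utils import prediction_utils
# Skeleton for illustration only; see the prediction/ folder for the real implementation.
class GetmlPredictorSketch(Predictor):
    def load(self, artifacts_uri: str) -> None:
        # Download the model artifact (the getML project dump) into the container.
        prediction_utils.download_model_artifacts(artifacts_uri)
        # ... restore the getML project / pipeline here ...

    def predict(self, instances):
        # ... convert instances to getML data frames and return pipeline predictions ...
        raise NotImplementedError("getML-specific prediction logic goes here")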
print("Content of Dockerfile.pred:\n")
%cat prediction/Dockerfile.pred
Now let's build the Dockerfile.pred image:
! docker build -f prediction/Dockerfile.pred \
-t {cfg.DOCKER_IMAGE_URI_PRED} .
Deploy Local Model¶
Before deploying the model to the cloud, it is advisable to build and test it locally. Once the model is confirmed to be functioning correctly, you can then proceed with the cloud deployment.
See the Google documentation for more details of the LocalModel
class.
from google.cloud.aiplatform.prediction import LocalModel
local_model = LocalModel(serving_container_image_uri=cfg.DOCKER_IMAGE_URI_PRED)
cfg.print(["DOCKER_IMAGE_URI_PRED"])
Local Prediction on Test Data¶
To run a prediction using the local model, we will send a request with test data in JSON format (as string) to the local endpoint.
We have prepared some test request data in JSON format, which can be loaded using load_json_from_file()
.
Note: Refer to [OPTIONAL] Create Test Request Data
for details on how this test data was created.
from getml.vertexai import load_json_from_file
request_json = load_json_from_file("./prediction/request_test.json")
request_json
[OPTIONAL] Create Test Request Data¶
If you would like to recreate the test data JSON or see how it is generated, uncomment the following code and check its source in src/getml/vertexai/request_data.py
.
# from getml.vertexai import create_test_request
# create_test_request()
Deploy local_model to a local_endpoint¶
Now that we have the prediction container ready, as well as some test data, we can deploy a local endpoint and send test data to the predict
endpoint.
See the Google documentation for more details and requirements of the LocalModel
class and its deploy_to_local_endpoint
method.
Verify that the training has successfully finished by checking the following links:
cfg.print_links(["training_jobs", "model_artifact"])
Wait for the training job to finish before proceeding to the next step.
NOTE: This may take a few minutes.
from getml.vertexai.utils_gcp import wait_for_training_artifact
wait_for_training_artifact(cfg)
with local_model.deploy_to_local_endpoint(
credential_path=PATH_SERVICE_ACCOUNT_CREDENTIALS.name,
artifact_uri=cfg.ARTIFACT_URI,
) as local_endpoint:
health_check_response = local_endpoint.run_health_check()
print(
"Health check response:", health_check_response, health_check_response.content
)
# Make a prediction
predict_response = local_endpoint.predict(
request=request_json,
headers={"Content-Type": "application/json"},
)
print("Predict response:", predict_response, predict_response.content)
You should see an output similar to:
Health check response: <Response [200]> b'{}'
Predict response: <Response [200]> b'{"predictions": [[0.9659892320632935], [0.8711856007575989], [0.882280170917511],...
If there is an issue you can check the logs of the container build process:
local_endpoint.container.logs().decode("utf-8").strip().split("\n")
Manually Spin-Up Container and Call Endpoint with Test Data¶
Alternatively, you can manually run your Docker container. This way, you have more control over the parameters of docker run
, especially the Google environment variables.
See more details about them in the Google documentation.
NOTE: You should run the docker run
command in a separate Terminal, not in this notebook.
from getml.vertexai import cmd_to_run_local_endpoint
cmd_to_run_local_endpoint(cfg)
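For reference, such a command essentially amounts to docker run with the Vertex AI serving environment variables set. The following is a hedged illustration assembled in Python; the mount path, routes, and port are assumptions, and the exact command printed by cmd_to_run_local_endpoint may differ.
# Hedged illustration only; cmd_to_run_local_endpoint(cfg) prints the real command.
docker_cmd = " ".join([
    "docker run --rm -p 8080:8080",
    "-v $(pwd)/service_account.json:/credentials/service_account.json",  # assumed mount path
    "-e GOOGLE_APPLICATION_CREDENTIALS=/credentials/service_account.json",
    f"-e AIP_STORAGE_URI={cfg.ARTIFACT_URI}",  # where the model artifact is loaded from
    "-e AIP_HTTP_PORT=8080",  # port the model server listens on
    "-e AIP_HEALTH_ROUTE=/health",  # assumed health route
    "-e AIP_PREDICT_ROUTE=/predict",  # assumed predict route
    cfg.DOCKER_IMAGE_URI_PRED,
])
print(docker_cmd)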
Push Image to GCP / Vertex AI¶
Before we can deploy the container to the cloud, we need to push the image to the Artifact Registry
.
Rebuild Prediction Container¶
To ensure compatibility with GCP (x86_64), the container image must be built with the correct architecture. Regardless of your current platform, the docker build
command will now enforce the linux/amd64 platform.
! docker build --platform linux/amd64 -f prediction/Dockerfile.pred \
-t {cfg.DOCKER_IMAGE_URI_PRED} .
local_model.push_image()
cfg.print_links(["image_for_predictions"])
Upload to Model Registry¶
The Model Registry
serves as a centralized repository where you can manage and version your machine learning models. By uploading the model, you make it accessible for deployment and further analysis.
cfg.print(["GCP_PROJECT_NAME", "REGION", "ARTIFACT_URI"])
model = aiplatform.Model.upload(
project=cfg.GCP_PROJECT_NAME,
location=cfg.REGION,
local_model=local_model,
display_name="getML model (Loans)",
artifact_uri=f"{cfg.ARTIFACT_URI}",
description="getML model trained on the Loans dataset. Generated by demo_binary_classification.ipynb",
)
cfg.print_links(["model_registry"])
Online Prediction Endpoint¶
Endpoints are machine learning models made available for online prediction requests. Endpoints are useful for timely predictions from many users (for example, in response to an application request). You can also request batch predictions if you don't need immediate results.
Deploy Endpoint¶
NOTE: If you encounter a "FailedPrecondition" error, this is very likely related to an exception thrown within the Docker container. You should check the container's logs to find the cause.
NOTE: Deployment of the endpoint can take a while (30+ minutes).
ENDPOINT_MACHINE_TYPE = "n1-standard-4"
endpoint = model.deploy(
machine_type=ENDPOINT_MACHINE_TYPE, service_account=cfg.SERVICE_ACCOUNT_EMAIL
)
Prediction on Deployed Endpoint¶
Once the endpoint is deployed, you can also make predictions using the Test your model
feature in the Vertex AI console (see link below).
As the JSON request, you can use the content of the request_test.json file:
# model_id is just needed to build the link
model_id = Path(endpoint.gca_resource.deployed_models[0].model).name
cfg.print_links(["deployed_model"], model_id)
print("JSON request:", request_json)
# PROJECT_ID (int): The numerical project ID.
# ENDPOINT_ID (int): The numerical endpoint ID.
# Example URL format: https://europe-west1-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/europe-west1/endpoints/{ENDPOINT_ID}:predict
! curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://{cfg.REGION}-aiplatform.googleapis.com/v1/{endpoint.resource_name}:predict" \
-d "@prediction/request_test.json"
The result should look similar to:
{
"predictions": [
[
0.96598923206329346
],
[
0.87118560075759888
],
[
0.882280170917511
],
...
],
"deployedModelId": "5059851355955396608",
"model": "projects/956851751872/locations/europe-west1/models/8409526724114513920",
"modelDisplayName": "getML model (Loans)",
"modelVersionId": "1"
}
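Alternatively, you can query the deployed endpoint through the Python SDK instead of curl. Here is a minimal sketch, assuming request_json follows the standard {"instances": [...]} payload format:
# Sketch: call the endpoint via the Vertex AI SDK (assumes the request JSON
# uses the standard {"instances": [...]} format).
sdk_response = endpoint.predict(instances=request_json["instances"])
print(sdk_response.predictions[:3])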
Undeploy Endpoint¶
Remember to undeploy your cloud endpoints after testing to avoid unnecessary costs.
endpoint.undeploy_all()
Conclusion¶
In this notebook, we walked through the complete workflow of training and deploying a machine learning model using Vertex AI. We began by setting up our environment, configuring necessary project variables, and initializing Vertex AI. We then trained a binary classification model using the getML framework, logged and tracked our experiments using the MetadataStore, and saved the model artifact to Google Cloud Storage.
Next, we built and tested a custom prediction routine locally before pushing our Docker image to the Artifact Registry. We deployed the trained model to the Vertex AI Model Registry and created an online prediction endpoint to serve real-time predictions. Additionally, we discussed how to manually manage the Docker container and perform batch predictions.
By following these steps, you have learned how to leverage Vertex AI for end-to-end machine learning workflows, from data preprocessing and model training to deployment and prediction. This powerful combination of tools and services ensures a scalable, efficient, and well-managed approach to developing and deploying getML models on Google Cloud.