
Using Kubeflow Pipelines with AI Platform



Lab · 2 hours · 5 Credits · Advanced

Note: This lab may incorporate AI tools to support your learning.

Overview

In this lab, you learn how to install and use Kubeflow Pipelines to orchestrate Google Cloud services in an end-to-end ML pipeline. After Kubeflow Pipelines is installed, you create an AI Platform Notebook instance and complete two example notebooks that demonstrate the services used and how to author a pipeline.

Learning objectives

In this lab, you will perform the following tasks:

  • Create a Kubernetes cluster and install Kubeflow Pipelines.
  • Launch an AI Platform Notebook.
  • Download example notebooks.
  • Create and run an end-to-end ML Pipeline using AI Platform and pre-built components.
  • Examine and verify the output of each step.
  • Test the online prediction of your finished model.
  • Convert a series of basic Python functions to pipeline components.
  • Assemble and execute a new pipeline with these Python functions.

Setup and requirements

Before you click the Start Lab button

Read these instructions. Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources will be made available to you.

This hands-on lab lets you do the lab activities yourself in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials that you use to sign in and access Google Cloud for the duration of the lab.

What you need

To complete this lab, you need:

  • Access to a standard internet browser (Chrome browser recommended).
  • Time to complete the lab.
Note: If you have a personal Google Cloud account or project, do not use it for this lab.
Note: If you are using a Pixelbook, open an Incognito window to run this lab.

Log in to Google Cloud Console

  1. Using the browser tab or window you are using for this lab session, copy the Username from the Connection Details panel and click the Open Google Console button.
Note: If you are asked to choose an account, click Use another account.
  2. Paste in the Username, and then the Password as prompted.
  3. Click Next.
  4. Accept the terms and conditions.

Since this is a temporary account, which will last only as long as this lab:

  • Do not add recovery options
  • Do not sign up for free trials
  5. Once the console opens, view the list of services by clicking the Navigation menu at the top-left.

Task 1. Set up your environment

Enable the Vertex AI API

  1. Click to open the Vertex AI Dashboard.
  2. Click Enable Vertex AI API.

Task 2. Set up Kubeflow Pipelines

You will deploy Kubeflow Pipelines as a Kubernetes app: a packaged solution that you can deploy to Google Kubernetes Engine with a few clicks. Kubernetes apps can also be deployed to Kubernetes clusters on-premises or in third-party clouds. Kubeflow Pipelines is integrated into your Google Cloud environment as AI Platform Pipelines. While the installation runs, you can learn more in the Kubeflow Pipelines introduction documentation.

  1. In the Google Cloud Console, on the Navigation menu, scroll down to AI Platform and pin the section for easier access later in the lab.

  2. Hold the pointer over AI Platform, and click Pipelines.

  3. Click New instance.
    A new tab opens.

  4. Click Configure.

  5. Select Allow access to the following Cloud APIs and then click CREATE NEW CLUSTER.

This should take 2-3 minutes to complete. Wait for the cluster to finish before proceeding to the next step.

  6. On the first tab, in the Google Cloud Console, do one of the following:

    • On the Navigation menu, click Kubernetes Engine to view the cluster being created.
    • On the Navigation menu, click Compute Engine to view the individual VMs spinning up.
  7. When the cluster creation is complete, navigate back to the Deploy Kubeflow Pipelines tab and check the GCP Marketplace Terms of Service checkbox.

  8. Leave other settings unchanged, and then click Deploy. The individual services of Kubeflow Pipelines are deployed to your GKE cluster. Continue to the next task while the installation runs.

Task 3. Start a JupyterLab Notebook instance

  1. In the Google Cloud Console, on the Navigation Menu, click Vertex AI > Workbench. Select User-Managed Notebooks.

  2. On the Notebook instances page, click Create New, and in Environment choose the latest version of TensorFlow Enterprise 2.6 (with LTS).

  3. In the New notebook instance dialog, confirm the name of the deep learning VM. If you don’t want to change the region and zone, leave all settings as they are, and then click Create. The new VM will take 2-3 minutes to start.

  4. Click Open JupyterLab.
    A JupyterLab window will open in a new tab.

Note: Before clicking Open JupyterLab, make sure to enable the AI Platform Training & Prediction API to avoid pipeline run failures.

Task 4. Clone the sample code

  1. In JupyterLab, click the Terminal icon to open a new terminal.

  2. At the command-line prompt, run the following commands:

git clone https://github.com/GoogleCloudPlatform/training-data-analyst
cd training-data-analyst/courses/machine_learning/deepdive2/production_ml/labs/samples

  3. To confirm that you have cloned the repository, double-click the training-data-analyst directory and confirm that you can see its contents.

  4. Open the samples/core folder, which contains the files for all the Jupyter notebook-based examples throughout this lab.

Task 5. Prepare pipeline prerequisites

Complete a few more steps before starting the example notebooks.

  1. In the Google Cloud Console, create a Cloud Storage bucket for use in your pipelines. Name it the same as your Project ID.

  2. In the Terminal window in your notebook environment, run the following code:

export PROJECT=$(gcloud config list project --format "value(core.project)")
echo "Your current GCP Project Name is: "$PROJECT
export REGION=us-central1
gsutil mb -l ${REGION} gs://${PROJECT}

  3. Note your bucket name; you will need it later. It is the same as your Project ID:

gsutil ls

  4. Identify your Kubeflow Pipelines host ID.

  5. In the Google Cloud Console, on the AI Platform page, click Pipelines. Your installed pipelines instance should be listed.

  6. Click Settings, and then note the host value in the kfp.Client() method. You will use this value as the ENDPOINT to connect to your Kubeflow Pipelines environment in the example notebook.

Kubeflow Pipelines is fully installed when a green checkmark appears next to your pipeline instance.

  7. To open the UI, click Open pipelines dashboard.

  8. In the left navigation pane, click Pipelines and examine some of the existing sample pipelines (but don't run any). Also note the Experiments section: you will generally group multiple related pipeline runs in a single experiment for later comparison, although this lab uses the 'Default' experiment.

Task 6. Create and run an AI Platform pipeline

Now you will author and execute a pipeline from your AI Platform Notebook using the Kubeflow Pipelines SDK, specifically its DSL (domain-specific language), to build the pipeline. This pipeline uses pre-built components that call out to Google Cloud services such as BigQuery and AI Platform Training, instead of executing all of the pipeline logic on the local GKE cluster.
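Before opening the notebook, it may help to see the overall shape of a KFP submission. The sketch below is a hypothetical example, not the notebook's actual code: the function name submit_pipeline and the empty pipeline body are illustrative stand-ins, and the kfp import is deferred inside the function because the SDK is only available in your notebook environment.

```python
def submit_pipeline(endpoint: str):
    """Define and submit a toy pipeline to a Kubeflow Pipelines host.

    Hypothetical sketch only -- ai_platform.ipynb defines its own, richer
    components. 'endpoint' is the host value you noted from the Pipelines
    Settings dialog. The kfp import is deferred so this sketch can be read
    without the SDK installed.
    """
    import kfp
    from kfp import dsl

    @dsl.pipeline(name="sketch", description="Toy example pipeline")
    def pipeline():
        # Real pipelines instantiate component ops here and chain their
        # inputs and outputs; the pre-built components call out to
        # BigQuery, AI Platform Training, and so on.
        pass

    # Because the notebook runs outside the GKE cluster, connect to the
    # standalone AI Platform Pipelines endpoint rather than using an
    # in-cluster client.
    return kfp.Client(host=endpoint).create_run_from_pipeline_func(
        pipeline, arguments={}
    )
```

This mirrors the edit you make in the notebook: switching from the in-cluster kfp.Client() to kfp.Client(host=ENDPOINT).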

  1. In your AI Platform Notebook, navigate to training-data-analyst/courses/machine_learning/deepdive2/production_ml/labs/samples/core if you haven't already.

Both notebooks you will complete are located here.

  2. Open ai_platform.ipynb located in training-data-analyst/courses/machine_learning/deepdive2/production_ml/labs/samples/core/ai_platform.

  3. Complete this notebook, being sure to insert the appropriate values for project_id and the Cloud Storage bucket location.

Note: You will need to edit this cell: uncomment this step and add your Pipelines instance host ID.

pipeline = kfp.Client().create_run_from_pipeline_func(pipeline, arguments={})
# Run the pipeline on a separate Kubeflow Cluster instead
# (use if your notebook is not running in Kubeflow - e.g. if using AI Platform Notebooks)
# pipeline = kfp.Client(host='<ADD KFP ENDPOINT HERE>').create_run_from_pipeline_func(pipeline, arguments={})

Your cell should become:

# pipeline = kfp.Client().create_run_from_pipeline_func(pipeline, arguments={})
# Run the pipeline on a separate Kubeflow Cluster instead
# (use if your notebook is not running in Kubeflow - e.g. if using AI Platform Notebooks)
pipeline = kfp.Client(host='704d162300234e8d-dot-us-central2.pipelines.googleusercontent.com').create_run_from_pipeline_func(pipeline, arguments={})

The pipeline will take about 10 minutes to execute.

  4. After it starts, click Run details to view the job progress, and then continue with the lab. You will return to inspect it when it's done.

Task 7. Create and run a Python function-based pipeline

  1. Open lightweight_component.ipynb located in training-data-analyst/courses/machine_learning/deepdive2/production_ml/labs/samples/core/lightweight_component.

  2. Complete the steps in this notebook, modifying the kfp.Client() method as before.

  3. Click Run details to examine the pipeline output in the UI. It will take a couple of minutes to complete.
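The idea behind lightweight (Python function-based) components can be sketched without the SDK: you write an ordinary, self-contained Python function with type hints, and kfp wraps it into a pipeline step. In the sketch below, add is an illustrative stand-in rather than the notebook's exact function, and the wrapping call is shown in a comment because it requires the kfp SDK.

```python
# An ordinary, self-contained Python function of the kind the notebook
# turns into a pipeline component. Type hints matter: the SDK uses them
# to generate the component's input/output interface.
def add(a: float, b: float) -> float:
    """Return the sum of two arguments."""
    return a + b

# In the notebook (with the kfp SDK installed), a function like this is
# wrapped into a reusable component and then used inside a @dsl.pipeline
# function, for example:
#   import kfp.components as comp
#   add_op = comp.create_component_from_func(add)

print(add(2.0, 3.0))  # prints 5.0
```

Because each component runs in its own container, the function must be self-contained: any imports it needs go inside the function body.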

Task 8. Examine the completed AI Platform pipeline

  • Return to the completed first pipeline notebook and query your finished model.

End your lab

When you have completed your lab, click End Lab. Qwiklabs removes the resources you’ve used and cleans the account for you.

You will be given an opportunity to rate the lab experience. Select the applicable number of stars, type a comment, and then click Submit.

The number of stars indicates the following:

  • 1 star = Very dissatisfied
  • 2 stars = Dissatisfied
  • 3 stars = Neutral
  • 4 stars = Satisfied
  • 5 stars = Very satisfied

You can close the dialog box if you don't want to provide feedback.

For feedback, suggestions, or corrections, please use the Support tab.

Copyright 2022 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.
