Quick tip: Review the prerequisites before you run the lab

Use an Incognito or private browser window to run this lab. This prevents conflicts between your personal account and the student account, which could otherwise result in extra charges to your personal account.

Vertex AI: Predicting Loan Risk with AutoML

Lab: 1 hour, 5 credits, Beginner

This lab may incorporate AI tools to support your learning.

Overview
Setup
Introduction to Vertex AI
Task 1. Prepare the training data
Task 2. Train your model
Task 3. Evaluate the model performance (demonstration only)
Task 4. Deploy the model (demonstration only)
Task 5. Get predictions
Congratulations!
End your lab

Overview

In this lab, you use Vertex AI to train and serve a machine learning model to predict loan risk with a tabular dataset.

Objectives

You learn how to:

Upload a dataset to Vertex AI.
Train a machine learning model with AutoML.
Evaluate the model performance.
Deploy the model to an endpoint.
Get predictions.

Setup

Before you click the Start Lab button

Note: Read these instructions.

Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources will be made available to you.

This Qwiklabs hands-on lab lets you do the lab activities yourself in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials that you use to sign in and access Google Cloud for the duration of the lab.

What you need

To complete this lab, you need:

Access to a standard internet browser (Chrome browser recommended).
Time to complete the lab.

Note: If you already have your own personal Google Cloud account or project, do not use it for this lab.

Note: If you are using a Pixelbook, open an Incognito window to run this lab.

How to start your lab and sign in to the Console

Click the Start Lab button. If you need to pay for the lab, a pop-up opens for you to select your payment method. On the left is a panel populated with the temporary credentials that you must use for this lab.
Copy the username, and then click Open Google Console. The lab spins up resources, and then opens another tab that shows the Choose an account page.
Note: Open the tabs in separate windows, side-by-side.
On the Choose an account page, click Use Another Account. The Sign in page opens.
Paste the username that you copied from the Connection Details panel. Then copy and paste the password.

Note: You must use the credentials from the Connection Details panel. Do not use your Google Cloud Skills Boost credentials. If you have your own Google Cloud account, do not use it for this lab, to avoid incurring charges.

Click through the subsequent pages:

Accept the terms and conditions.
Do not add recovery options or two-factor authentication (because this is a temporary account).
Do not sign up for free trials.

After a few moments, the Cloud console opens in this tab.

Note: You can view the menu with a list of Google Cloud Products and Services by clicking the Navigation menu at the top-left.

Introduction to Vertex AI

This lab uses Vertex AI, the unified AI platform on Google Cloud, to train and deploy an ML model. Vertex AI offers two options on one platform to build an ML model: a codeless solution with AutoML and a code-based solution with Custom Training using Vertex AI Workbench. You use AutoML in this lab.

In this lab, you build an ML model to determine whether a particular customer will repay a loan.

Task 1. Prepare the training data

The initial Vertex AI dashboard illustrates the major stages to train and deploy an ML model: prepare the training data, train the model, and get predictions. Later, the dashboard displays your recent activities, such as the recent datasets, models, predictions, endpoints, and notebook instances.

Create a dataset

In the Google Cloud console, on the Navigation menu, click Vertex AI > Datasets.
Click Create dataset.
Name the dataset LoanRisk.
For the data type and objective, click Tabular, and then select Regression/classification.
Click Create.

Upload data

There are three options to import data in Vertex AI:

Upload CSV files from your computer.
Select CSV files from Cloud Storage.
Select a table or view from BigQuery.

For convenience, the dataset is already uploaded to Cloud Storage.

For the data source, select Select CSV files from Cloud Storage.
For Import file path, enter:

spls/cbl455/loan_risk.csv

Click Continue.

Note: You can also configure this page by clicking Datasets on the left menu and then selecting the dataset name on the Datasets page.

(Optional) Generate statistics

To see the descriptive statistics for each column of your dataset, click Generate statistics.
Generating the statistics might take a few minutes, especially the first time.
When the statistics are ready, click each column name to display analytical charts.

Task 2. Train your model

With a dataset uploaded, you're ready to train a model to predict whether a customer will repay the loan.

Click Train new model and select Other.

Training method

The dataset is already named LoanRisk.
For Objective, select Classification.

You select classification instead of regression because you are predicting a discrete class (whether a customer will repay a loan: 0 for repay, 1 for default) rather than a continuous number.

Click Continue.

Model details

Specify the name of the model and the target column.

Give the model a name, such as LoanRisk.
For Target column, select Default.
(Optional) Explore Advanced options to determine how to assign the training vs. testing data and specify the encryption.
Click Continue.
For Add features, click Continue.

Training options

Specify which columns you want to include in the training model. For example, ClientID might be irrelevant to predict loan risk.

Click the minus sign on the ClientID row to exclude it from the training model.
(Optional) Explore Advanced options to select different optimization objectives.
For more information about optimization objectives for tabular AutoML models, refer to the Optimization objectives for tabular AutoML models guide.
Click Continue.

Compute and pricing

For Budget, which represents the number of node hours for training, enter 1.
Training your AutoML model for 1 compute hour is typically a good start for understanding whether there is a relationship between the features and label you've selected. From there, you can modify your features and train for more time to improve model performance.
Leave early stopping Enabled.
Click Start training.

Depending on the data size and the training method, the training can take from a few minutes to a couple of hours. Normally you would receive an email from Google Cloud when the training job is complete. However, in the Qwiklabs environment, you will not receive an email.

Note: To eliminate the typical hour-long wait for model training, download a pretrained model in Task 5. This model is the result of Tasks 1 and 2. Tasks 3 and 4 are only for demonstration and apply if you train the model yourself.

Task 3. Evaluate the model performance (demonstration only)

Vertex AI provides many metrics to evaluate the model performance. You focus on three:

Precision/Recall curve
Confusion Matrix
Feature Importance

Note: If you had trained the model yourself, you could review its evaluation metrics as follows:

1. Navigate to the Model Registry.

2. Click the model you just trained.

3. Open the Evaluate tab.

In this lab, however, you can skip these steps because you use a pretrained model.

The precision/recall curve

Confidence threshold slider set to 0.5 and graphs for precision/recall curve, ROC curve, and Precision-recall by threshold

The confidence threshold determines how an ML model counts the positive cases. A higher threshold generally increases precision but decreases recall. A lower threshold generally decreases precision but increases recall.

You can manually adjust the threshold to observe its impact on precision and recall and find the best tradeoff point between the two to meet your business needs.
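The tradeoff described above can be seen with a few lines of Python. This is a minimal sketch, not the computation Vertex AI performs internally: the scores and labels in `SCORED` are made up for illustration, and `precision_recall` is a hypothetical helper name.

```python
# Hypothetical (score, true_label) pairs; 1 = positive class. Not lab data.
SCORED = [
    (0.95, 1), (0.85, 1), (0.70, 0), (0.60, 1),
    (0.40, 0), (0.30, 1), (0.20, 0), (0.10, 0),
]

def precision_recall(scored, threshold):
    """Return (precision, recall) when scores >= threshold count as positive."""
    tp = sum(1 for s, y in scored if s >= threshold and y == 1)
    fp = sum(1 for s, y in scored if s >= threshold and y == 0)
    fn = sum(1 for s, y in scored if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Raising the threshold trades recall for precision on this toy data.
for t in (0.25, 0.5, 0.8):
    p, r = precision_recall(SCORED, t)
    print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f}")
```

On this toy data, moving the threshold from 0.25 to 0.8 raises precision from about 0.67 to 1.00 while recall falls from 1.00 to 0.50.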

The confusion matrix

A confusion matrix tells you the percentage of examples from each class in your test set that your model predicted correctly.

Confusion matrix table displaying true label and predicted label classifications

The confusion matrix shows that your initial model is able to predict 100% of the repay examples and 87% of the default examples in your test set correctly, which is not too bad.

You can improve the percentage by adding more examples (more data), engineering new features, or changing the training method.
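To make the per-class percentages concrete, the following sketch builds a row-normalized confusion matrix from a small set of labels. The labels in `TRUE` and `PRED` are invented for illustration (they are not the lab's test set), and `confusion_percentages` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical test-set labels: "0" = repay, "1" = default. Not lab data.
TRUE = ["0", "0", "0", "0", "1", "1", "1", "1"]
PRED = ["0", "0", "0", "0", "1", "1", "1", "0"]

def confusion_percentages(true_labels, predicted_labels):
    """Return {true_class: {predicted_class: percent of that true class}}."""
    counts = Counter(zip(true_labels, predicted_labels))
    classes = sorted(set(true_labels) | set(predicted_labels))
    matrix = {}
    for t in classes:
        row_total = sum(counts[(t, p)] for p in classes)
        matrix[t] = {
            p: (100.0 * counts[(t, p)] / row_total if row_total else 0.0)
            for p in classes
        }
    return matrix

matrix = confusion_percentages(TRUE, PRED)
for true_class, row in matrix.items():
    print(true_class, row)
```

Each row sums to 100%, so the diagonal entries are the share of each true class that the model predicted correctly.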

The feature importance

In Vertex AI, feature importance is displayed through a bar chart to illustrate how each feature contributes to a prediction. The longer the bar, or the larger the numerical value associated with a feature, the more important it is.

Feature importance bar chart for loan, income, and age

These feature importance values could be used to help you improve your model and have more confidence in its predictions. You might decide to remove the least important features next time you train a model or to combine two of the more significant features into a feature cross to see if this improves model performance.

Feature importance is just one example of Vertex AI’s comprehensive machine learning functionality called Explainable AI. Explainable AI is a set of tools and frameworks to help understand and interpret predictions made by machine learning models.
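As an illustration of the feature cross idea mentioned above, the sketch below bucketizes two numeric features and combines the bucket indexes into one categorical feature. The choice of income and loan, the bucket boundaries, and the helper names are all assumptions made for this example. AutoML does not require you to do this by hand.

```python
def bucket(value, boundaries):
    """Return the index of the bucket that value falls into."""
    return sum(1 for b in boundaries if value >= b)

# Hypothetical bucket boundaries, chosen only for illustration.
INCOME_BOUNDS = [30000, 60000]
LOAN_BOUNDS = [2000, 6000]

def income_loan_cross(income, loan):
    """Cross bucketized income and loan into one categorical feature."""
    return f"income{bucket(income, INCOME_BOUNDS)}_x_loan{bucket(loan, LOAN_BOUNDS)}"

# Values taken from the lab's INPUT-JSON example instance.
print(income_loan_cross(44964.0106, 3944.219318))
```

A crossed feature like this lets a model learn patterns that depend on the combination of the two inputs rather than on each one separately.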

Task 4. Deploy the model (demonstration only)

Note: You will not deploy the model to an endpoint because the model training can take an hour. Here you can review the steps you would perform in a production environment.

Now that you have a trained model, the next step is to create an endpoint in Vertex AI. A model in Vertex AI can be deployed to more than one endpoint, and an endpoint can serve more than one deployed model, with traffic split between those models.

Create and define an endpoint

On your model page, click Deploy & test, and then click Deploy to Endpoint.
For Endpoint name, type LoanRisk.
Click Continue.

Model settings and monitoring

Leave the traffic splitting settings as-is.
For Machine type, select e2-standard-8, 8 vCPUs, 32 GiB memory.
For Explainability Options, click Feature attribution.
Click Done.
Click Continue.
In Model monitoring, click Continue.
In Model objectives > Training data source, select Vertex AI dataset.
Select your dataset from the drop down menu.
In Target column, type Default.
Leave the remaining settings as-is and click Deploy.

Your endpoint will take a few minutes to deploy. When it is completed, a green check mark will appear next to the name.

Now you're ready to get predictions on your deployed model.

Task 5. Get predictions

In this section, use the AutoML-Gateway to work with an existing trained model.

ENVIRONMENT VARIABLE	VALUE
Credit_Risk ENDPOINT	1411183591831896064
INPUT_DATA_FILE	INPUT-JSON

To use the trained model, you will need to create some environment variables.

Open a Cloud Shell window.
Download the lab assets:

gcloud storage cp gs://cloud-training/CBL455/INPUT-JSON .

Create an INPUT_DATA_FILE environment variable:

export INPUT_DATA_FILE="INPUT-JSON"

Create a PROJECT_NUMBER environment variable:

export PROJECT_NUMBER=$(gcloud projects describe $(gcloud config get-value project) --format="value(projectNumber)")

Create an AUTOML_SERVICE environment variable:

export AUTOML_SERVICE="https://automl-proxy-$PROJECT_NUMBER.us-central1.run.app/v1"

Note: After the lab assets are downloaded, take a moment to review the contents.

The INPUT-JSON file provides the model with the input data for a prediction. Edit this file to generate custom predictions.

The INPUT-JSON file contains the following values:

{
  "instances": [
    {
      "age": 40.77430558,
      "ClientID": "997",
      "income": 44964.0106,
      "loan": 3944.219318
    }
  ]
}
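If you prefer to generate a request file rather than edit it by hand, a short script can write one in the same shape. This is a sketch under assumptions: `build_request` and the output filename `custom-input.json` are hypothetical names. Only the field names and sample values come from the lab's INPUT-JSON file.

```python
import json

def build_request(age, client_id, income, loan):
    """Return a request body shaped like the lab's INPUT-JSON file."""
    return {
        "instances": [
            {
                "age": float(age),
                "ClientID": str(client_id),  # the lab file sends ClientID as a string
                "income": float(income),
                "loan": float(loan),
            }
        ]
    }

# Values from the lab's INPUT-JSON file; change them to try other customers.
body = build_request(40.77430558, "997", 44964.0106, 3944.219318)

# Hypothetical filename, chosen so the original INPUT-JSON is left untouched.
CUSTOM_INPUT_FILE = "custom-input.json"
with open(CUSTOM_INPUT_FILE, "w", encoding="utf-8") as f:
    json.dump(body, f, indent=2)
```

To use the generated file, set INPUT_DATA_FILE to its name before requesting a prediction.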

Enter the following command to request a prediction:

curl -X POST -H "Content-Type: application/json" $AUTOML_SERVICE -d "@${INPUT_DATA_FILE}" -s | jq
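The same request can be assembled in Python with the standard library. This is a sketch, not part of the lab: the project number `123456789012` is a placeholder, the helper names are hypothetical, and `send` only works from inside the lab environment where the gateway URL exists.

```python
import urllib.request

def automl_service_url(project_number):
    """Build the lab's AUTOML_SERVICE URL for a project number."""
    return f"https://automl-proxy-{project_number}.us-central1.run.app/v1"

def build_prediction_request(project_number, body):
    """Return a POST request equivalent to the lab's curl command."""
    return urllib.request.Request(
        automl_service_url(project_number),
        data=body,  # bytes of the INPUT-JSON file
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def send(request):
    """Send the request and return the response text (lab environment only)."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")

# Placeholder project number; the body is copied from the lab's INPUT-JSON.
EXAMPLE_BODY = (
    b'{"instances": [{"age": 40.77430558, "ClientID": "997", '
    b'"income": 44964.0106, "loan": 3944.219318}]}'
)
request = build_prediction_request("123456789012", EXAMPLE_BODY)
print(request.get_method(), request.full_url)
```

The script builds the request but does not send it. Call `send(request)` only after replacing the placeholder with your lab project number.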

Expected Output:

{
  "predictions": [
    {
      "scores": [0.9999980926513672, 0.000001897001311590429],
      "classes": ["0", "1"]
    }
  ],
  "deployedModelId": "3093594712003575808",
  "model": "projects/1030115194620/locations/us-central1/models/4831874217005809664",
  "modelDisplayName": "credit_risk_20211119212817",
  "modelVersionId": "1"
}
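To read the response, pair each entry in `classes` with the score at the same position and take the highest. The sketch below does this for the expected output shown above. The `top_class` helper and the `LABEL_MEANING` mapping are hypothetical names, and the mapping follows the lab's convention of 0 for repay and 1 for default.

```python
import json

# Response copied from the lab's expected output.
RESPONSE = """
{ "predictions": [ { "scores": [ 0.9999980926513672, 0.000001897001311590429 ],
                     "classes": [ "0", "1" ] } ],
  "deployedModelId": "3093594712003575808",
  "model": "projects/1030115194620/locations/us-central1/models/4831874217005809664",
  "modelDisplayName": "credit_risk_20211119212817",
  "modelVersionId": "1" }
"""

def top_class(prediction):
    """Return (class_label, score) for the highest-scoring class."""
    pairs = zip(prediction["classes"], prediction["scores"])
    return max(pairs, key=lambda pair: pair[1])

LABEL_MEANING = {"0": "repay", "1": "default"}  # per Task 2 of this lab

for prediction in json.loads(RESPONSE)["predictions"]:
    label, score = top_class(prediction)
    print(f"class {label} ({LABEL_MEANING[label]}), score {score:.6f}")
# prints: class 0 (repay), score 0.999998
```

For this instance, the model assigns almost all of the probability to class 0, so the customer is predicted to repay the loan.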

If you use the Google Cloud console, the following image illustrates how the same action could be performed:

Prediction steps highlighted in the relevant sections

Congratulations!

You can now use Vertex AI to:

Upload a dataset.
Train a model with AutoML.
Evaluate the model performance.
Deploy the trained AutoML model to an endpoint.
Get predictions.

To learn more about different parts of Vertex AI, refer to the Vertex AI documentation.

End your lab

When you have completed your lab, click End Lab. Google Cloud Skills Boost removes the resources you’ve used and cleans the account for you.

You will be given an opportunity to rate the lab experience. Select the applicable number of stars, type a comment, and then click Submit.

The number of stars indicates the following:

1 star = Very dissatisfied
2 stars = Dissatisfied
3 stars = Neutral
4 stars = Satisfied
5 stars = Very satisfied

You can close the dialog box if you don't want to provide feedback.

For feedback, suggestions, or corrections, please use the Support tab.

Copyright 2022 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.
