
Migrating SQL Server Data to BigQuery using Striim [PWDW]


Lab: 4 hours, 5 Credits, Intermediate
Note: This lab may incorporate AI tools to support your learning.

Overview

This lab demonstrates how to migrate SQL Server data to BigQuery using Google Cloud's data migration partner, Striim.

Striim is a comprehensive streaming ETL platform that enables online data migrations and continuous streaming replication from on-premises and cloud data sources to Google Cloud (including BigQuery, Cloud Spanner, and Cloud SQL) through a graphical drag-and-drop interface.

This lab focuses on implementing a batch migration from SQL Server to BigQuery. It does not explain database migration or replication concepts, or why you might want to migrate your underlying database.

To learn more about database migration and replication, review Google Cloud documentation on database migration.

What you will learn

In this lab, you will learn to:

  • Create a new BigQuery dataset as the target destination for migrated data.
  • Deploy a Striim server instance through the Google Cloud Marketplace.
  • Configure the connectors and credentials for Striim to connect to SQL Server and BigQuery.
  • Use Striim to complete an online database migration from SQL Server to BigQuery.

Setup and requirements

Before you click the Start Lab button

Note: Read these instructions.

Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources will be made available to you.

This Qwiklabs hands-on lab lets you do the lab activities yourself in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials that you use to sign in and access Google Cloud for the duration of the lab.

What you need

To complete this lab, you need:

  • Access to a standard internet browser (Chrome browser recommended).
  • Time to complete the lab.
Note: If you already have your own personal Google Cloud account or project, do not use it for this lab.

Note: If you are using a Pixelbook, open an Incognito window to run this lab.

How to start your lab and sign in to the Console

  1. Click the Start Lab button. If you need to pay for the lab, a pop-up opens for you to select your payment method. On the left is a panel populated with the temporary credentials that you must use for this lab.

  2. Copy the username, and then click Open Google Console. The lab spins up resources, and then opens another tab that shows the Choose an account page.

    Note: Open the tabs in separate windows, side-by-side.
  3. On the Choose an account page, click Use Another Account. The Sign in page opens.

  4. Paste the username that you copied from the Connection Details panel. Then copy and paste the password.

Note: You must use the credentials from the Connection Details panel. Do not use your Google Cloud Skills Boost credentials. If you have your own Google Cloud account, do not use it for this lab to avoid incurring charges.
  5. Click through the subsequent pages:
  • Accept the terms and conditions.
  • Do not add recovery options or two-factor authentication (because this is a temporary account).
  • Do not sign up for free trials.

After a few moments, the Cloud console opens in this tab.

Note: You can view the menu with a list of Google Cloud Products and Services by clicking the Navigation menu at the top-left.

Activate Google Cloud Shell

Google Cloud Shell is a virtual machine loaded with development tools. It offers a persistent 5 GB home directory and runs on Google Cloud.

Google Cloud Shell provides command-line access to your Google Cloud resources.

  1. In Cloud console, on the top right toolbar, click the Open Cloud Shell button.

  2. Click Continue.

It takes a few moments to provision and connect to the environment. When you are connected, you are already authenticated, and the project is set to your PROJECT_ID.

gcloud is the command-line tool for Google Cloud. It comes pre-installed on Cloud Shell and supports tab-completion.

  • You can list the active account name with this command:
gcloud auth list

Output:

Credentialed accounts:
 - @.com (active)

Example output:

Credentialed accounts:
 - google1623327_student@qwiklabs.net
  • You can list the project ID with this command:
gcloud config list project

Output:

[core]
project =

Example output:

[core]
project = qwiklabs-gcp-44776a13dea667a6

Note: Full documentation of gcloud is available in the gcloud CLI overview guide.

Install SQL Server client

To review the source data in the SQL Server instance, you can install the SQL Server client.

  1. To install the SQL Server client in Cloud Shell, run the following commands and accept all confirmation prompts and license terms:
curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
curl https://packages.microsoft.com/config/ubuntu/22.04/prod.list | sudo tee /etc/apt/sources.list.d/msprod.list
sudo apt-get update
sudo apt-get install mssql-tools unixodbc-dev

For Do you want to continue?, enter Y. For Do you accept the license terms?, enter yes.

  2. To set environment variables for SQL Server, run the following commands:

echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bash_profile
echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
source ~/.bashrc

Leave the browser tab with Cloud Shell open, and proceed to the next task in a new browser tab.

Task 1. Create a BigQuery dataset to receive the migrated data

In this task, you create a BigQuery dataset and table with the same schema as the source data in SQL Server. This new dataset and table will be the target destination for the migrated data from SQL Server.

Review source data in SQL Server

The SQL Server instance contains one table named Orders, which you migrate in a later task. First, connect to the SQL Server instance and review the data in the table.

  1. In the Google Cloud Console, in the Navigation menu (), click Compute Engine > VM instances.

  2. Locate the line for sqlsrvcdc, and copy the External IP address (for example, '34.72.125.251').

Return to the original browser tab with Cloud Shell.

  3. To assign the SQL Server instance's IP address to a variable named SQLSRVIP, run the following command in Cloud Shell:

export SQLSRVIP='SQLSRV_IP_ADDRESS'

Note: Be sure to replace SQLSRV_IP_ADDRESS with the copied external IP address and keep the single quotes around the value (for example, '34.72.125.251').
  4. To confirm that the variable was successfully created, run the following command:
echo $SQLSRVIP

The output will resemble the following value.

34.72.125.251
  5. To view the data in the Orders table, run the following command:

sqlcmd -S $SQLSRVIP,1433 \
  -U striim -P striim-password -d striimlab \
  -Q "select * from striimlab.dbo.ORDERS"

The table contains ten rows and resembles the following table structure.

ORDER_ID ORDER_DATE ORDER_MODE CUSTOMER_ID More columns...
1001 1568927976017 In-Store 1001 ...
1002 1568928036017 In-Store 1002 ...
1003 1568928096017 CompanyB 1003 ...
1004 1568928156017 CompanyA 1004 ...
1005 1568928216017 CompanyA 1005 ...
1006 1568928276017 In-Store 1006 ...
1007 1568928336017 CompanyA 1007 ...
1008 1568928396017 Online 1008 ...
1009 1568928456017 CompanyA 1009 ...
1010 1568928516017 CompanyB 1010 ...
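The ORDER_DATE values above are Unix epoch timestamps in milliseconds, which is why the destination column can simply be typed as STRING. As an illustrative sketch (not a lab step), such a value can be decoded in Python:

```python
from datetime import datetime, timezone

def epoch_ms_to_iso(ms: int) -> str:
    """Decode an epoch-milliseconds value, as stored in ORDER_DATE, into an ISO 8601 UTC string."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

# First row of the Orders table: ORDER_DATE = 1568927976017
print(epoch_ms_to_iso(1568927976017))  # 2019-09-19T21:19:36Z
```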



Create BigQuery dataset and table

After reviewing the source data in SQL Server, you can create a new dataset and table in BigQuery with the same schema as the SQL Server data.

  1. To create a BigQuery dataset named striimlab, run the following in Cloud Shell:
bq --location=US mk -d \
  --default_table_expiration 7200 \
  --description "Striim Target Destination Dataset" striimlab

For the purposes of this lab, the dataset is set to expire in 7200 seconds (two hours).

  2. To create a destination table named orders in the new BigQuery dataset, run the following command:

bq mk \
  -t \
  --expiration 7200 \
  --description "Striim Target Destination Table" \
  striimlab.orders \
  ORDER_ID:INTEGER,ORDER_DATE:STRING,ORDER_MODE:STRING,CUSTOMER_ID:INTEGER,ORDER_STATUS:INTEGER,ORDER_TOTAL:FLOAT,SALES_REP_ID:INTEGER,PROMOTION_ID:STRING

For the purposes of this lab, the table is also set to expire in 7200 seconds (two hours).

  3. To pause Cloud Shell and preserve the applied configurations, run the following one-word command:
cat

Leave the browser tab with Cloud Shell open, and proceed to the next task in a new browser tab.
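The inline schema string passed to bq mk above can equivalently be expressed as a JSON schema file (a list of name/type objects) and supplied with the bq CLI's --schema flag. A minimal illustrative Python sketch, with the field list copied from the command above:

```python
import json

# Field definitions matching the inline schema used with bq mk above.
FIELDS = [
    ("ORDER_ID", "INTEGER"), ("ORDER_DATE", "STRING"), ("ORDER_MODE", "STRING"),
    ("CUSTOMER_ID", "INTEGER"), ("ORDER_STATUS", "INTEGER"), ("ORDER_TOTAL", "FLOAT"),
    ("SALES_REP_ID", "INTEGER"), ("PROMOTION_ID", "STRING"),
]

# Build the JSON schema structure expected by `bq mk --schema`.
schema = [{"name": name, "type": col_type} for name, col_type in FIELDS]

with open("orders_schema.json", "w") as f:
    json.dump(schema, f, indent=2)

print(len(schema))  # 8 fields
```

You could then create the table with a command of the form `bq mk -t --schema orders_schema.json striimlab.orders` instead of the inline schema string.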

Click Check my progress to verify the objective. Create a BigQuery dataset

Task 2. Set up a Striim server instance

In this task, you set up an instance of the Striim server through the Google Cloud Marketplace.

Create a Striim server instance

  1. In the Google Cloud Console, in the Navigation menu (), click Marketplace.

  2. In the search box, type Striim and press Enter. Select STRIIM (metered) from the list of options.

  3. Click Get Started.

  4. Under Terms and agreements, check the box for By purchasing, deploying....

  5. Click Agree.

  6. On the pop-up window entitled Successfully agreed to terms, click Deploy.

  7. On the New STRIIM ( metered ) deployment screen that appears, change the Zone to .

  8. Change the Machine Type to:
    Series: E2
    Machine type: e2-standard-4 (4 vCPU, 16 GB memory).

  9. Leave the other options at their default values.

  10. Scroll down the page and check the box for I accept the GCP Marketplace Terms of Service....

  11. Click Deploy.

Note: Marketplace solutions typically come with various resources that launch together in order to support the software. Be sure to review the monthly billing estimate before launching Striim in a regular, non-training account.
  12. You will be redirected to the Deployment Manager page. Once the environment is provisioned, the deployment details will appear. Take note of the name of the VM and the deployment zone (Instance and Instance zone in the example screenshot below).

Leave the Deployment Manager page open, and proceed to the next task in the original browser tab with Cloud Shell.

Click Check my progress to verify the objective. Set up a Striim server instance

Configure variables containing the Striim server details

  1. To unpause Cloud Shell, press Ctrl+C.

  2. To create a variable containing the Striim server instance name, run the following command and replace INSTANCENAME with the instance name provided on the Deployment Manager page (for example, striim-metered-1-vm):

export STRIIMVM_NAME=INSTANCENAME
  3. To create a variable containing the Striim server instance zone, run the following command and replace ZONENAME with the instance zone provided on the Deployment Manager page (for example, us-central1-f):
export STRIIMVM_ZONE=ZONENAME
  4. To confirm that the variables were successfully created, run the following commands:

echo $STRIIMVM_NAME
echo $STRIIMVM_ZONE

The output will resemble the following values.

striim-metered-1-vm
us-central1-f
  5. To pause Cloud Shell and preserve the applied configurations, run the following one-word command:
cat

Leave the browser tab with Cloud Shell open, and proceed to the next task using the Deployment Manager page.

Configure the Striim server

  1. On the Deployment Manager page, within the Striim server instance details, click the link for Site address.

    The Striim configuration wizard will open in a new browser tab. Leave the Deployment Manager page open.

    If there is a redirect notice, click the link provided. If a window opens to confirm the use of cookies, click I Agree.

Note: If you see the pop-up The connection is not secure, click Continue to Site.
  2. Click Accept Striim EULA and Continue.

  3. To configure your Striim server instance, enter the following values into the form:

    Property Value
    Cluster name my_cluster
    Admin password striim-password
    Re-enter Admin password striim-password
    Sys user password striim-password
    Re-enter Sys user password striim-password
    Striim Key Store password striim-password
    Re-enter Striim Key Store password striim-password



    Note: Do not use spaces or other punctuation in the form. Be sure not to include additional spaces before or after the input values.
  4. Click Save and Continue.

  5. Click Launch.

    Processing will occur in the background and may take a few minutes to complete. After configuration is complete, the Log in button will appear.

  6. Click Log in.

  7. Enter the following login details:

    Property Value
    Username admin
    Password striim-password



    Important: The "admin" user is a built-in user that must be used for a successful login.
  8. Click Log in.

    After a successful login, the Striim administration console will appear.

Leave the Striim administration console page open, and proceed to the next task using the Deployment Manager page.

Task 3. Set up connector between Striim and SQL Server instance

In this task, you use the SQL Server JDBC driver to allow Striim to connect to your SQL Server instance.

  1. On the Deployment Manager page, within the Striim server instance details, click SSH (located next to Visit the site).

    A new SSH terminal window for the virtual machine will open.

Note: If you encounter a warning of "Please consider adding the IAP-secured Tunnel User IAM role to start using Cloud IAP for TCP forwarding for better performance", you can ignore the warning and proceed.
  2. To download the SQL Server JDBC driver, run the following command in the new SSH terminal:

sudo wget https://storage.googleapis.com/cloud-training/dbmigration/mssql-jdbc-7.2.2.jre8.jar

  3. To copy the Java archive file to the Striim library path and make it executable, run the following commands:

sudo cp mssql-jdbc-7.2.2.jre8.jar /opt/striim/lib/mssql-jdbc-7.2.2.jre8.jar
sudo chmod +x /opt/striim/lib/mssql-jdbc-7.2.2.jre8.jar
sudo chown striim /opt/striim/lib/mssql-jdbc-7.2.2.jre8.jar
sudo chgrp striim /opt/striim/lib/mssql-jdbc-7.2.2.jre8.jar

  4. To restart the Striim server so that it recognizes the new JDBC driver, run the following commands:

sudo systemctl stop striim-node
sudo systemctl stop striim-dbms
sudo systemctl start striim-dbms
sudo systemctl start striim-node

  5. Close the Striim SSH terminal window.

  6. Return to the browser tab for the Striim administration console, and refresh the page.

Note: The administration console takes a few minutes to refresh after the restart. If the administration console does not reload on the first attempt, keep refreshing the page until the administration console is available for login.

If the administration console page does not resolve after 5 minutes, use the site address http://STRIIM_IP_ADDRESS:9080/, replacing STRIIM_IP_ADDRESS with the IP address provided on the Deployment Manager page (for example, http://104.154.159.38:9080).

  7. Log in to the administration console again using the following login details:

    Property Value
    Username admin
    Password striim-password



  8. Click Log in.

Leave the Striim administration console page open, and proceed to the next task in the original browser tab with Cloud Shell.

Click Check my progress to verify the objective. Set up connector between Striim and SQL Server instance

Task 4. Configure credentials for Striim to write data to BigQuery

In this task, you create service account credentials that Striim can use to write to the BigQuery table.

  1. To unpause Cloud Shell, press Ctrl+C.

  2. To create a variable for the Google Cloud project name, run the following command:

export PROJECT=$(gcloud info --format='value(config.project)')
  3. To create a service account for Striim to connect to BigQuery, run the following command:

gcloud iam service-accounts create striim-bq --display-name striim-bq

  4. To create an environment variable for the service account that you created in the previous step, run the following command:

export sa_striim_bq=$(gcloud iam service-accounts list --filter="displayName:striim-bq" --format='value(email)')

  5. To confirm that the variables were successfully created, run the following commands:

echo $PROJECT
echo $sa_striim_bq

The output will resemble the following values.

qwiklabs-gcp-00-444c1562d2ad
striim-bq@qwiklabs-gcp-00-444c1562d2ad.iam.gserviceaccount.com

Note the Google Cloud project name for use in a later task (for example, qwiklabs-gcp-00-444c1562d2ad).

  6. To add policy bindings for the service account to access BigQuery, run the following commands:

gcloud projects add-iam-policy-binding $PROJECT --role roles/bigquery.dataEditor --member serviceAccount:$sa_striim_bq
gcloud projects add-iam-policy-binding $PROJECT --role roles/bigquery.user --member serviceAccount:$sa_striim_bq

  7. To create a service account key named striim-bq-key.json, run the following command:

gcloud iam service-accounts keys create ~/striim-bq-key.json --iam-account $sa_striim_bq

  8. To copy the JSON file to the Striim server, run the following command:

gcloud compute scp ~/striim-bq-key.json $STRIIMVM_NAME:~ --zone=$STRIIMVM_ZONE

If you are asked to set a passphrase for a new SSH key, press ENTER to leave the passphrase empty.

Important: If you encounter a workgroup error in Cloud Shell, you can safely ignore the warning and proceed.

  9. To move the JSON file to the /opt/striim directory, run the following command:

gcloud compute ssh --zone=$STRIIMVM_ZONE $STRIIMVM_NAME \
  -- 'sudo cp ~/striim-bq-key.json /opt/striim && sudo chown striim /opt/striim/striim-bq-key.json'

  10. To confirm that the file was moved to the correct directory, run the following command:

gcloud compute ssh --zone=$STRIIMVM_ZONE $STRIIMVM_NAME \
  -- 'sudo ls -l /opt/striim/striim-bq-key.json'

The output will resemble the following.

-rw-------. 1 striim root 2357 Jul 21 22:14 /opt/striim/striim-bq-key.json
Connection to 35.239.92.36 closed.

You can now create a Striim application for the data migration!

  11. To pause Cloud Shell and preserve the applied configurations, run the following one-word command:
cat

Leave the browser tab with Cloud Shell open, and proceed to the next task using the Striim application console page.

Click Check my progress to verify the objective. Configure credentials for Striim to write data to BigQuery

Task 5. Create and configure an application for online database migration

An online database migration moves data from a source database (either on-premises or hosted on a cloud provider) to a target database in Google Cloud. With this type of configuration, the source database remains fully accessible to business applications, with minimal performance impact.

An online migration can be done as a one-time batch load or as a continuous load and capture of changes. In this task, you create and run a one-time batch migration of data from SQL Server to BigQuery.

Create the Striim application

  1. On the Striim administration console, click Apps > Create New.

  2. Click Start from scratch.

  3. For Name, enter Batch

  4. For Namespace, select admin.

The default namespace in this lab is admin. Namespaces are logical groupings that help you organize applications.

  5. Click Save.

The Flow Designer page will load. The standard connectors to create data applications are in the left side menu bar.

  6. To do a one-time batch load of data, drag the Database tile from Sources to the center of the page.

  7. Enter the following SQL Server connection properties in the New Source pane.

Note: Be sure not to include additional spaces before or after the input values. Replace the placeholder EXTERNAL_IP_ADDRESS with the external IP address of the SQL Server instance (for example, 34.172.20.145). You accessed the external IP address in Task 1 of this lab.
Property Value
Name batch_source
Connection URL jdbc:sqlserver://EXTERNAL_IP_ADDRESS:1433;DatabaseName=striimlab
Username striim
Password striim-password



  8. Click Show Advanced Settings.

  9. For Tables, enter striimlab.dbo.ORDERS

  10. Click Hide Advanced Settings.

  11. For Output To, select New Output, and enter batch_inprocess

  12. Click Save.

When you see the message Component saved, the DatabaseReader component called batch_source has been saved successfully.
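The Connection URL entered for batch_source follows the Microsoft SQL Server JDBC driver format, jdbc:sqlserver://host:port;DatabaseName=db. As an illustrative sketch (the helper function is hypothetical, not part of the lab or of Striim), the URL can be assembled like this:

```python
def sqlserver_jdbc_url(host: str, port: int = 1433, database: str = "striimlab") -> str:
    """Build a SQL Server JDBC connection URL in the form used by the batch_source component."""
    return f"jdbc:sqlserver://{host}:{port};DatabaseName={database}"

# Example with the sample external IP address from the instructions:
print(sqlserver_jdbc_url("34.172.20.145"))
# jdbc:sqlserver://34.172.20.145:1433;DatabaseName=striimlab
```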

Test the connection to SQL Server

  1. To test the configuration settings, click Created, and select Deploy App.

  2. Select default, and click Deploy.

  3. Click the wave icon for batch_inprocess, and then click the eye icon for Preview on run.

  4. To start the application, click Deployed, and select Start App.

You will now see the Striim application running and data flowing through the pipeline.

If there are any errors, review the details you provided in the previous section for the DatabaseReader component called batch_source.

You have successfully connected to your source SQL Server database and tested that Striim can read the data.

  5. To stop the application, click Running, and select Stop App.

  6. Click Stopped, and then select Undeploy App.

After the successful test connection to the source SQL Server database, you can now connect the workflow to the target destination table in BigQuery.

Connect the destination table in BigQuery

Note: For this section, be sure to not include additional spaces before or after the input values.
  1. Click the wave icon for batch_inprocess.

  2. Click the plus sign (+) icon, and select Connect next Target component.

  3. For Name, enter BigQueryInitialLoadTarget

  4. For Adapter, search for BigQuery, and select BigQueryWriter.

  5. For Tables, enter striimlab.dbo.ORDERS,striimlab.orders

The Tables property is a list of source-target pairs separated by semicolons. Each pair denotes the full names of the source and target tables separated by commas. If you must specify more than one table, the list typically resembles the following:

srcSchema1.srcTable1,tgtSchema1.tgtTable1; srcSchema2.srcTable2,tgtSchema2.tgtTable2
  6. For Project_ID, enter the Google Cloud project name (for example, qwiklabs-gcp-00-5222d70993ea).

The Google Cloud project name was provided in step 5 of Task 4 and is also available on the left side panel of the current lab instruction page. You can also return to the Dashboard homepage of your Google Cloud project to see the project name.

  7. Click Show Advanced Settings.

  8. For Service Account Key, enter /opt/striim/striim-bq-key.json

The Service Account Key property is the fully qualified path and name of the key file that you created and uploaded in a previous task.

  9. Click Save.
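The semicolon-and-comma convention for the Tables property can be sketched as a small parser (an illustrative Python example, not Striim's actual implementation):

```python
def parse_table_mappings(spec: str) -> list[tuple[str, str]]:
    """Split a 'src1,tgt1;src2,tgt2' Tables value into (source, target) pairs."""
    pairs = []
    for entry in spec.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing semicolon
        src, tgt = (part.strip() for part in entry.split(","))
        pairs.append((src, tgt))
    return pairs

# The single mapping used in this lab:
print(parse_table_mappings("striimlab.dbo.ORDERS,striimlab.orders"))
# [('striimlab.dbo.ORDERS', 'striimlab.orders')]
```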

Task 6. Run the online database migration and review the migrated data in BigQuery

Deploy the Striim application

  1. To deploy the application, click Created, and select Deploy App.

  2. Select default, and click Deploy.

  3. To start the application, click Deployed, and select Start App.

Important: Due to the default configuration settings, it may take a few minutes for the transactions to appear.

Test that the data migration was successful

  1. In the Google Cloud Console, in the Navigation menu (), click BigQuery.

  2. To see the dataset that you created in a previous task, click the project name in the Explorer pane (for example, qwiklabs-gcp-00-5222d70993ea) to expand the dataset list.

  3. To see the table that you created in a previous task, click striimlab to expand the table list.

  4. For the orders table, select Query.

  5. Enter the following SQL query into the Query editor window, and click Run.

SELECT
  COUNT(*) AS ORDERS,
  AVG(ORDER_TOTAL) AS ORDERS_AVE,
  SUM(ORDER_TOTAL) AS ORDERS_SUM
FROM striimlab.orders;

The output should resemble the following:

Row ORDERS ORDERS_AVE ORDERS_SUM
1 10 43148.957 431489.56999999995
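As a quick sanity check (illustrative, not a lab step), the three aggregates in this output are mutually consistent: the average multiplied by the row count should reproduce the sum, up to floating-point rounding:

```python
# Values from the example query output above.
orders, orders_ave, orders_sum = 10, 43148.957, 431489.56999999995

# AVG * COUNT should equal SUM within floating-point tolerance.
assert abs(orders * orders_ave - orders_sum) < 1e-6
print("aggregates consistent")
```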

Click Check my progress to verify the objective. Review the migrated data in BigQuery

Stop the Striim application

  1. To stop the application, on the Flow Designer page, click Running, and then select Stop App.

  2. Click Stopped, and then select Undeploy App.

Congratulations!

You have successfully set up a migration pipeline from SQL Server to BigQuery using Striim to complete the online database migration.


End your lab

When you have completed your lab, click End Lab. Google Cloud Skills Boost removes the resources you’ve used and cleans the account for you.

You will be given an opportunity to rate the lab experience. Select the applicable number of stars, type a comment, and then click Submit.

The number of stars indicates the following:

  • 1 star = Very dissatisfied
  • 2 stars = Dissatisfied
  • 3 stars = Neutral
  • 4 stars = Satisfied
  • 5 stars = Very satisfied

You can close the dialog box if you don't want to provide feedback.

For feedback, suggestions, or corrections, please use the Support tab.

Last Tested Date: September 22, 2022

Last Updated Date: September 22, 2022

Copyright 2022 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.
