
Google Cloud Skills Boost


BigQuery Fundamentals for Oracle Professionals


Working with SQL in BigQuery for Oracle Professionals

Lab | 1 hour 30 minutes | 5 Credits | Introductory
Note: This lab may incorporate AI tools to support your learning.

Overview

BigQuery supports many types of SQL statements including data definition language (DDL), data manipulation language (DML), user-defined functions (UDFs), and stored procedures.

The objective of this lab is to provide Oracle professionals with the necessary knowledge and skills to begin working with SQL in BigQuery. Upon completion of this lab, Oracle professionals will have a deeper understanding of how to use SQL in BigQuery to create, update, and work with data structures in BigQuery.

In this lab, you create tables and views using DDL statements, update tables using DML statements, join data using SQL, and define custom user-defined functions (UDFs) and stored procedures.

For more information on translating SQL from Oracle to BigQuery, review the Oracle to BigQuery SQL translation guide: https://cloud.google.com/bigquery/docs/migration/oracle-sql.

Objectives

In this lab, you learn how to:

  • Create new tables and views using DDL statements.
  • Update existing table data using DML statements.
  • Join data and define common table expressions (CTEs) using SQL SELECT statements.
  • Define custom UDFs and stored procedures.

Setup and requirements

For each lab, you get a new Google Cloud project and set of resources for a fixed time at no cost.

  1. Sign in to Qwiklabs using an incognito window.

  2. Note the lab's access time (for example, 1:15:00), and make sure you can finish within that time.
    There is no pause feature. You can restart if needed, but you have to start at the beginning.

  3. When ready, click Start lab.

  4. Note your lab credentials (Username and Password). You will use them to sign in to the Google Cloud Console.

  5. Click Open Google Console.

  6. Click Use another account and copy/paste credentials for this lab into the prompts.
    If you use other credentials, you'll receive errors or incur charges.

  7. Accept the terms and skip the recovery resource page.

How to start your lab and sign in to the Console

  1. Click the Start Lab button. If you need to pay for the lab, a pop-up opens for you to select your payment method. On the left is a panel populated with the temporary credentials that you must use for this lab.

  2. Copy the username, and then click Open Google Console. The lab spins up resources, and then opens another tab that shows the Choose an account page.

    Note: Open the tabs in separate windows, side-by-side.
  3. On the Choose an account page, click Use Another Account. The Sign in page opens.

  4. Paste the username that you copied from the Connection Details panel. Then copy and paste the password.

Note: You must use the credentials from the Connection Details panel. Do not use your Google Cloud Skills Boost credentials. If you have your own Google Cloud account, do not use it for this lab, to avoid incurring charges.
  5. Click through the subsequent pages:
  • Accept the terms and conditions.
  • Do not add recovery options or two-factor authentication (because this is a temporary account).
  • Do not sign up for free trials.

After a few moments, the Cloud console opens in this tab.

Note: You can view the menu with a list of Google Cloud Products and Services by clicking the Navigation menu at the top-left.

Task 1. Create BigQuery dataset and tables using DDL statements

In BigQuery, you can use data definition language (DDL) to create datasets and tables. You can also use the SQL statement LOAD DATA to load data from one or more files into a new or existing table.

To learn more about using DDL statements to create BigQuery datasets and tables and using the LOAD DATA SQL statement to load data, review the documentation titled CREATE SCHEMA statement, CREATE TABLE statement, and LOAD DATA statement.

In this task, you use DDL to create a dataset and tables in BigQuery, and then load data into the new tables using the LOAD DATA statement.

  1. In the Google Cloud console, in the Navigation menu, click BigQuery.

The Welcome to BigQuery in the Cloud Console message box opens. This message box provides a link to the quickstart guide and the release notes.

Click Done.

  2. In BigQuery Studio, click the Editor icon to open the SQL code editor.

  3. In the query editor, copy and paste the following query, and click Run:
CREATE SCHEMA IF NOT EXISTS animals_dataset
OPTIONS (location = 'us');

This query creates a new BigQuery dataset named animals_dataset. Note that the DDL statement uses the term SCHEMA to refer to a logical collection of tables, views, and other resources, which in BigQuery is known as a dataset.

  4. In the query editor, run the following queries to create two new tables:
CREATE TABLE animals_dataset.owners (
  OwnerID INT64 NOT NULL,
  Name STRING
);

CREATE TABLE animals_dataset.pets (
  PetID INT64 NOT NULL,
  OwnerID INT64 NOT NULL,
  Type STRING,
  Name STRING,
  Weight FLOAT64
);

These queries create two tables called owners and pets in the BigQuery dataset you previously created named animals_dataset.

Before moving to the next step, update the field mode from REQUIRED to NULLABLE in both table schemas to avoid errors in step 5.

In the Explorer panel, select the table name, choose the SCHEMA tab, click EDIT SCHEMA, update the field mode from REQUIRED to NULLABLE, and click Save. Repeat this for every field with mode REQUIRED in both tables.

  5. In the query editor, run the following queries:
LOAD DATA INTO animals_dataset.owners
FROM FILES (
  skip_leading_rows = 1,
  format = 'CSV',
  field_delimiter = ',',
  uris = ['gs://tcd_repo/data/environmental/animals/owners.csv']
);

LOAD DATA INTO animals_dataset.pets
FROM FILES (
  skip_leading_rows = 1,
  format = 'CSV',
  field_delimiter = ',',
  uris = ['gs://tcd_repo/data/environmental/animals/pets.csv']
);

These queries load data into the owners and pets tables from CSV files in Cloud Storage.

The Results pane displays a message that the LOAD statement ran successfully.
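Conceptually, LOAD DATA parses each CSV row (skipping the header, per skip_leading_rows=1) and appends it to the table. Here is a minimal local sketch of that behavior using Python's built-in csv and sqlite3 modules, with a made-up CSV string standing in for the Cloud Storage files:

```python
import csv
import io
import sqlite3

# Hypothetical CSV contents -- the lab's real owners.csv lives in Cloud Storage.
csv_text = "OwnerID,Name\n1,Alice\n2,Bob\n"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE owners (OwnerID INTEGER, Name TEXT)")

reader = csv.reader(io.StringIO(csv_text))
next(reader)  # skip the header row, like skip_leading_rows=1
conn.executemany("INSERT INTO owners VALUES (?, ?)", reader)

count = conn.execute("SELECT COUNT(*) FROM owners").fetchone()[0]
print(count)  # 2 data rows loaded
```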

  6. Expand the Explorer pane (the left side panel), which lists the dataset and tables, and then click the table name owners.

  7. Click the Details and Preview tabs to see more information about the table and a preview of the data.

You can click Refresh (top right) to refresh the data in the Preview tab.

  8. Repeat steps 6-7 to see more information about the pets table and a preview of its data.

Click Check my progress to verify the objective. Create BigQuery dataset and tables

Task 2. Update BigQuery table data using DML statements

In BigQuery, you can use DML to update the data in an existing table, including adding, modifying, and deleting data from your BigQuery tables.

To learn more about using DML statements to work with data in BigQuery tables, review the documentation titled Data manipulation language (DML) statements in GoogleSQL.

In this task, you use DML to insert, update, and delete data from an existing BigQuery table.

  1. In the query editor, run the following query to add a new owner named Mary:
INSERT INTO animals_dataset.owners (OwnerID, Name) VALUES (9, 'Mary');
  2. In the query editor, run the following queries to add Mary's two dogs:
INSERT INTO animals_dataset.pets (PetID, OwnerID, Type, Name, Weight)
VALUES (28, 9, 'Dog', 'George', 50);

INSERT INTO animals_dataset.pets (PetID, OwnerID, Type, Name, Weight)
VALUES (29, 9, 'Dog', 'Washington', 60);

After each insert query, the Results pane displays that one record has been added to the pets table.

  3. In the query editor, run the following query to check that Mary and her two dogs were added to the appropriate tables:
SELECT o.Name, p.Type, p.Name
FROM animals_dataset.owners o
JOIN animals_dataset.pets p
  ON o.OwnerID = p.OwnerID
WHERE o.Name = 'Mary';

The record for Mary in the owners table has been joined to the two records in the pets table for her dogs: George and Washington.
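Because this is standard SQL join semantics, the same behavior can be reproduced locally. Here is a minimal sketch with Python's built-in sqlite3 module (the rows mirror the ones inserted above, but this is illustrative data, not the lab's tables):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owners (OwnerID INTEGER NOT NULL, Name TEXT);
    CREATE TABLE pets (PetID INTEGER NOT NULL, OwnerID INTEGER NOT NULL,
                       Type TEXT, Name TEXT, Weight REAL);
    INSERT INTO owners VALUES (9, 'Mary');
    INSERT INTO pets VALUES (28, 9, 'Dog', 'George', 50);
    INSERT INTO pets VALUES (29, 9, 'Dog', 'Washington', 60);
""")

# Each owner row is matched to every pet row with the same OwnerID.
rows = conn.execute("""
    SELECT o.Name, p.Type, p.Name
    FROM owners o JOIN pets p ON o.OwnerID = p.OwnerID
    WHERE o.Name = 'Mary'
""").fetchall()
print(rows)
```

One owner row joined to two pet rows yields two result rows, one per pet.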

  4. In the query editor, run the following query to update all values of the animal type "Dog" to be "Canine":
UPDATE animals_dataset.pets SET Type = 'Canine' WHERE Type = 'Dog';

This statement modifies 10 rows in the pets table.

  5. Click Go To Table, and then click the Preview tab.

Verify that all the dogs are now identified as canines in the pets table.

You can click Refresh (top right) to refresh the data in the Preview tab.

  6. In the query editor, run the following query to delete all values of the animal type "Frog":
DELETE FROM animals_dataset.pets WHERE Type = 'Frog';

This statement removes 1 row from the pets table.

  7. Click Go To Table, and then click the Preview tab.

Verify that all frogs have been removed from the pets table.

You can click Refresh (top right) to refresh the data in the Preview tab.

Click Check my progress to verify the objective. Update BigQuery table data using DML statements

Task 3. Join data and write CTEs using SQL SELECT statements

In BigQuery, you can write SQL SELECT statements with syntax for joins, CTEs, ordering, grouping, filtering, pivoting, windowing, and more, to retrieve the data that you need.

To learn more about SQL SELECT syntax for working with data in BigQuery tables, review the documentation titled Query syntax.

In this task, you write SQL SELECT statements that include JOIN operations to join multiple tables and WITH clauses to define CTEs.

  1. In the query editor, run the following query with a JOIN to select all owners and their pets:
SELECT o.Name, p.Type, p.Name, p.Weight
FROM animals_dataset.owners o
JOIN animals_dataset.pets p
  ON o.OwnerID = p.OwnerID;
  2. In the query editor, run the same query with a WHERE clause to select only the canines:
SELECT o.Name, p.Type, p.Name, p.Weight
FROM animals_dataset.owners o
JOIN animals_dataset.pets p
  ON o.OwnerID = p.OwnerID
WHERE p.Type = 'Canine';
  3. In the query editor, run the same query with an ORDER BY clause to sort the results by the owner's name:
SELECT o.Name, p.Type, p.Name, p.Weight
FROM animals_dataset.owners o
JOIN animals_dataset.pets p
  ON o.OwnerID = p.OwnerID
WHERE p.Type = 'Canine'
ORDER BY o.Name ASC;
  4. In the query editor, run the following query to count the pets by type:
SELECT Type, COUNT(*) AS count
FROM animals_dataset.pets
GROUP BY Type
ORDER BY count DESC;

There are 10 canines, 5 cats, 2 pigs, and 2 turtles in the pets table.

  5. In the query editor, run the following query to count the pets by owner:
SELECT o.Name, COUNT(p.Name) AS count
FROM animals_dataset.owners o
JOIN animals_dataset.pets p
  ON o.OwnerID = p.OwnerID
GROUP BY o.Name
ORDER BY count DESC;

The owner named Doug has the most pets with 4 total pets.

  6. In the query editor, run the following query to return the pets' information as a nested and repeated field:
SELECT
  o.OwnerID,
  o.Name AS OwnerName,
  ARRAY_AGG(STRUCT(p.Name AS PetName, p.Type, p.Weight)) AS Pets
FROM animals_dataset.owners AS o
JOIN animals_dataset.pets AS p
  ON o.OwnerID = p.OwnerID
GROUP BY o.OwnerID, o.Name;

Note: In BigQuery, nested and repeated fields are stored as ARRAYs of STRUCTs. The ARRAY_AGG(STRUCT(...)) syntax returns the results as nested and repeated values, which makes relationships in the data easier to see because the joined values are grouped under each owner.
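To picture the shape of the nested result, here is how one row of it might look expressed in Python terms (the values are illustrative, not the lab's actual query output): each owner row carries an array of pet structs, and a function like ARRAY_LENGTH corresponds to taking that array's length.

```python
# One row of the ARRAY_AGG(STRUCT(...)) result, sketched as a Python dict.
# Hypothetical values, for illustration only.
row = {
    "OwnerID": 9,
    "OwnerName": "Mary",
    "Pets": [
        {"PetName": "George", "Type": "Canine", "Weight": 50.0},
        {"PetName": "Washington", "Type": "Canine", "Weight": 60.0},
    ],
}

# BigQuery's ARRAY_LENGTH(Pets) corresponds to len(row["Pets"]).
pet_count = len(row["Pets"])
print(pet_count)  # 2
```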

Another useful SQL option in BigQuery is to use a WITH clause to define CTEs and query the results of another query. Using this syntax, you can avoid using nested SQL statements and make your code easier to read.

  7. In the query editor, run the following query to define a CTE based on the previous query:
WITH owners_pets AS (
  SELECT
    o.OwnerID,
    o.Name AS OwnerName,
    ARRAY_AGG(STRUCT(p.Name AS PetName, p.Type, p.Weight)) AS Pets
  FROM animals_dataset.owners AS o
  JOIN animals_dataset.pets AS p
    ON o.OwnerID = p.OwnerID
  GROUP BY o.OwnerID, o.Name
)
SELECT op.OwnerName, op.Pets
FROM owners_pets AS op;
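The WITH clause is standard SQL, so the pattern transfers to other engines. A minimal sketch with Python's built-in sqlite3 module (sqlite has no ARRAY_AGG(STRUCT...), so this CTE simply filters and aggregates; the data is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pets (PetID INTEGER, Type TEXT, Weight REAL);
    INSERT INTO pets VALUES (1, 'Canine', 50), (2, 'Cat', 8), (3, 'Canine', 60);
""")

# The CTE names the result of the inner query so the outer query can
# read from it without nesting a subquery inline.
rows = conn.execute("""
    WITH heavy_pets AS (
        SELECT PetID, Type, Weight FROM pets WHERE Weight > 20
    )
    SELECT Type, COUNT(*) FROM heavy_pets GROUP BY Type
""").fetchall()
print(rows)  # [('Canine', 2)]
```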

Click Check my progress to verify the objective. Join data and write CTEs using SQL SELECT statements

Task 4. Create new tables and views using DDL statements

In a previous task, you used DDL to create new BigQuery datasets and tables. In BigQuery, you can also use DDL to create both logical views and materialized views.

To learn more about using DDL statements to create views in BigQuery, review the documentation titled Introduction to views.

In this task, you use DDL to create new tables, logical views, and materialized views.

  1. In the query editor, run the following code to write the results of a query to a new table:
CREATE OR REPLACE TABLE animals_dataset.owners_pets AS (
  SELECT
    o.OwnerID,
    o.Name AS OwnerName,
    ARRAY_AGG(STRUCT(p.PetID, p.Name AS PetName, p.Type, p.Weight)) AS Pets
  FROM animals_dataset.owners AS o
  JOIN animals_dataset.pets AS p
    ON o.OwnerID = p.OwnerID
  GROUP BY o.OwnerID, o.Name
);
  2. Expand the Explorer pane (the left side panel), which lists the dataset and tables, and then click the table name owners_pets.

  3. Click the Schema tab to review the schema of the newly created table named owners_pets.

The schema includes a nested and repeated field named Pets, which contains the pet ID, name, type, and weight of each pet for each owner.

In a previous task, you ran a query that joined the owners and pets tables to count the number of pets for each owner. Now that the data is contained in a nested and repeated field, you can use the ARRAY_LENGTH function to return the number of pets for each owner.

  4. In the query editor, run the following query to return the number of pets for each owner:
SELECT OwnerName, ARRAY_LENGTH(Pets) AS count
FROM animals_dataset.owners_pets
ORDER BY count DESC;
  5. In the query editor, run the following query to create a logical view that returns only small pets (those 20 pounds or less):
CREATE OR REPLACE VIEW animals_dataset.small_pets AS (
  SELECT *
  FROM animals_dataset.pets
  WHERE Weight <= 20
);
  6. Click Go to View.

  7. In the Explorer pane, click View actions (the icon with three vertical dots) for the small_pets view, and select Query.

  8. In the query editor, run the following query to examine the data in the view:

SELECT PetID, Weight FROM animals_dataset.small_pets;

In BigQuery, materialized views are precomputed views that periodically cache the results of a query for increased performance and efficiency. Materialized views can be especially useful for queries that require complex processing such as aggregations.

  9. In the query editor, run the following query to create a materialized view that returns the total weight of each pet type:
CREATE OR REPLACE MATERIALIZED VIEW animals_dataset.pet_weight_by_type AS (
  SELECT Type, SUM(Weight) AS total_weight
  FROM animals_dataset.pets
  GROUP BY Type
);
  10. Click Go to Materialized View.

  11. In the Explorer pane, click View actions (the icon with three vertical dots) for the pet_weight_by_type materialized view, and select Query.

  12. In the query editor, run the following query to examine the data in the materialized view:

SELECT type, total_weight FROM animals_dataset.pet_weight_by_type;

The canines have the highest total weight at 314 pounds.

Click Check my progress to verify the objective. Create new tables and views using DDL statements

Task 5. Define custom UDFs and stored procedures

In BigQuery, you can define a custom UDF for cases in which there is not a built-in function that already does what you need. A UDF accepts one or more columns of input, executes actions on the input, and returns the result of those actions as the output. In addition, you can also define stored procedures as functions that execute collections of SQL statements such as SELECT, INSERT, and more, in a desired order.

To learn more about defining UDFs and stored procedures in BigQuery, review the documentation titled User-defined functions and Work with SQL stored procedures.

In this task, you define UDFs and stored procedures to recalculate existing values in a table and to make it easier to add new data records to a table.

  1. In the query editor, run the following code to create a UDF that converts pounds into kilograms:
CREATE OR REPLACE FUNCTION animals_dataset.PoundsToKilos(pounds FLOAT64) AS (
  ROUND(pounds / 2.2, 1)
);
  2. In the query editor, run the following query to test the UDF:
SELECT Name, Weight AS Pounds, animals_dataset.PoundsToKilos(Weight) AS Kilos
FROM animals_dataset.pets;
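The same pounds-to-kilograms logic can be tried locally: Python's sqlite3 module lets you register a scalar UDF with create_function, which plays the same role as the CREATE FUNCTION statement above (the sample data is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Register a scalar UDF named PoundsToKilos, mirroring the BigQuery function.
conn.create_function("PoundsToKilos", 1, lambda pounds: round(pounds / 2.2, 1))

conn.execute("CREATE TABLE pets (Name TEXT, Weight REAL)")
conn.execute("INSERT INTO pets VALUES ('Duke', 15.0)")

kilos = conn.execute("SELECT PoundsToKilos(Weight) FROM pets").fetchone()[0]
print(kilos)  # 6.8
```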

You can also create a stored procedure to make adding a new pet easier. The following procedure finds the largest pet ID in the pets table and adds a value of 1 to that ID. It then assigns that new value as the ID for the new Pet. When the new pet is added, the value of the new pet ID is returned.

  3. In the query editor, run the following query to create a stored procedure that adds a new pet:
CREATE OR REPLACE PROCEDURE animals_dataset.create_pet(
  customerID INT64,
  type STRING,
  name STRING,
  weight FLOAT64,
  OUT newPetID INT64)
BEGIN
  SET newPetID = (SELECT MAX(PetID) + 1 FROM animals_dataset.pets);
  INSERT INTO animals_dataset.pets (PetID, OwnerID, Type, Name, Weight)
  VALUES (newPetID, customerID, type, name, weight);
END;
  4. In the query editor, run the following query to test the stored procedure:
DECLARE newPetID INT64;
CALL animals_dataset.create_pet(1, 'Dog', 'Duke', 15.0, newPetID);
SELECT * FROM animals_dataset.pets WHERE PetID = newPetID;
  5. Click View results for the final SELECT statement.

Notice that the output variable of the procedure identifies the newly created record: a dog named Duke with a pet ID of 30.

  6. In the query editor, run the following query to add another new pet:
DECLARE newPetID INT64;
CALL animals_dataset.create_pet(4, 'Cat', 'Fluffy', 6.0, newPetID);
SELECT * FROM animals_dataset.pets WHERE PetID = newPetID;
  7. Click View results for the final SELECT statement.

Notice that the ID field keeps incrementing in value; the new pet ID for a cat named Fluffy is 31.

  8. In the query editor, run the following query to verify that the two new pets have been added to the pets table:
SELECT *
FROM animals_dataset.pets
WHERE Name IN ('Duke', 'Fluffy');
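The procedure's ID-assignment logic (read MAX(PetID), add 1, insert) can be sketched in plain Python against a local sqlite3 table. Note that this read-then-write pattern is not safe under concurrent writers; the single-user lab environment sidesteps that concern:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pets (PetID INTEGER, OwnerID INTEGER,
                       Type TEXT, Name TEXT, Weight REAL);
    INSERT INTO pets VALUES (29, 9, 'Canine', 'Washington', 60);
""")

def create_pet(conn, owner_id, pet_type, name, weight):
    """Mirror of the stored procedure: derive the next ID, insert, return it."""
    new_id = conn.execute("SELECT MAX(PetID) + 1 FROM pets").fetchone()[0]
    conn.execute("INSERT INTO pets VALUES (?, ?, ?, ?, ?)",
                 (new_id, owner_id, pet_type, name, weight))
    return new_id

new_id = create_pet(conn, 1, 'Dog', 'Duke', 15.0)
print(new_id)  # 30, one past the previous maximum PetID
```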

Click Check my progress to verify the objective. Define custom UDFs and stored procedures

End your lab

When you have completed your lab, click End Lab. Google Cloud Skills Boost removes the resources you’ve used and cleans the account for you.

You will be given an opportunity to rate the lab experience. Select the applicable number of stars, type a comment, and then click Submit.

The number of stars indicates the following:

  • 1 star = Very dissatisfied
  • 2 stars = Dissatisfied
  • 3 stars = Neutral
  • 4 stars = Satisfied
  • 5 stars = Very satisfied

You can close the dialog box if you don't want to provide feedback.

For feedback, suggestions, or corrections, please use the Support tab.

Copyright 2022 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.

