
Before you begin
- Labs create a Google Cloud project and resources for a fixed time
- Labs have a time limit and no pause feature. If you end the lab, you'll have to restart from the beginning.
- On the top left of your screen, click Start lab to begin
BigLake is a unified storage engine that simplifies data access for data warehouses and lakes by providing uniform fine-grained access control across multi-cloud storage and open formats.
BigLake extends BigQuery's fine-grained row- and column-level security to tables based on data residing in object stores such as Amazon S3, Azure Data Lake Storage Gen2, and Google Cloud Storage. BigLake decouples access to the table from access to the underlying cloud storage data through access delegation. This feature helps you securely grant row- and column-level access to users and pipelines in your organization without providing them full access to the table.
After you create a BigLake table, you can query it like any other BigQuery table. BigQuery enforces row- and column-level access controls, and every user sees only the slice of data that they are authorized to see. Governance policies are enforced on all access to the data through BigQuery APIs. For example, the BigQuery Storage API lets users access authorized data using open source query engines such as Apache Spark.
In this lab, you will:
- Create and view a connection resource.
- Set up access to a Cloud Storage data lake.
- Create a BigLake table.
- Query a BigLake table through BigQuery.
- Set up column-level access control policies.
- Update an external table to a BigLake table.
Read these instructions. Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources are made available to you.
This hands-on lab lets you do the lab activities in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials you use to sign in and access Google Cloud for the duration of the lab.
To complete this lab, you need:
- Access to a standard internet browser (Chrome browser recommended).
- Time to complete the lab. Remember, once you start, you cannot pause a lab.
Click the Start Lab button. If you need to pay for the lab, a dialog opens for you to select your payment method. On the left is the Lab Details pane with the following:
- The Open Google Cloud console button
- The time remaining
- The temporary credentials that you must use for this lab
- Other information, if needed, to step through this lab
Click Open Google Cloud console (or right-click and select Open Link in Incognito Window if you are running the Chrome browser).
The lab spins up resources, and then opens another tab that shows the Sign in page.
Tip: Arrange the tabs in separate windows, side-by-side.
If necessary, copy the Username below and paste it into the Sign in dialog.
You can also find the Username in the Lab Details pane.
Click Next.
Copy the Password below and paste it into the Welcome dialog.
You can also find the Password in the Lab Details pane.
Click Next.
Click through the subsequent pages:
- Accept the terms and conditions.
- Do not add recovery options or two-factor authentication (because this is a temporary account).
- Do not sign up for free trials.
After a few moments, the Google Cloud console opens in this tab.
Cloud Shell is a virtual machine that is loaded with development tools. It offers a persistent 5GB home directory and runs on Google Cloud. Cloud Shell provides command-line access to your Google Cloud resources.
Click Activate Cloud Shell at the top of the Google Cloud console.
Click through the following windows:
- Continue through the Cloud Shell information window.
- Authorize Cloud Shell to use your credentials to make Google Cloud API calls.
When you are connected, you are already authenticated, and the project is set to your Project_ID.

gcloud is the command-line tool for Google Cloud. It comes pre-installed on Cloud Shell and supports tab-completion.

(Optional) You can list the active account name with this command:

gcloud auth list

(Optional) You can list the project ID with this command:

gcloud config list project

Note: For full documentation of gcloud, in Google Cloud, refer to the gcloud CLI overview guide.
BigLake tables access Google Cloud Storage data using a connection resource. A connection resource can be associated with a single table or an arbitrary group of tables in the project.
From the Navigation Menu, go to BigQuery > BigQuery Studio. Click Done.
To create a connection, click +ADD, and then click Connections to external data sources.
In the Connection ID field, enter my-connection.
For Location type, choose Multi-region and select US (multiple regions in United States) from the dropdown.
Click Create connection.
To view your connection information, select the connection in the navigation menu. Copy the Service account ID; you will need it in the next section.
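If you prefer the command line, a roughly equivalent sketch using the bq CLI from Cloud Shell (assuming your lab project is the active project) is:

# Create a Cloud Resource connection in the US multi-region
bq mk --connection --location=US \
    --connection_type=CLOUD_RESOURCE my-connection

# Show the connection details, including its service account ID
# ($GOOGLE_CLOUD_PROJECT is set automatically in Cloud Shell)
bq show --connection $GOOGLE_CLOUD_PROJECT.us.my-connection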
Click Check my progress to verify the objective.
In this section, you will give the new connection resource read-only access to the Cloud Storage data lake so that BigQuery can access Cloud Storage files on behalf of users. We recommend that you grant the connection resource service account the Storage Object Viewer IAM role, which lets the service account access Cloud Storage buckets.
From the Navigation Menu, go to IAM & Admin > IAM.
Click +GRANT ACCESS.
In the New principals field, enter the service account ID that you copied earlier.
In the Select a role field, select Cloud Storage, and then select Storage Object Viewer.
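You can make the same grant from Cloud Shell; a sketch, where SERVICE_ACCOUNT stands for the connection's service account ID you copied:

# Grant the connection's service account read-only access to Cloud Storage objects
gcloud projects add-iam-policy-binding $GOOGLE_CLOUD_PROJECT \
    --member="serviceAccount:SERVICE_ACCOUNT" \
    --role="roles/storage.objectViewer"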
Click Check my progress to verify the objective.
The following example uses the CSV file format, but you can use any format supported by BigLake, as listed in the Limitations section of the BigLake documentation. If you're familiar with creating tables in BigQuery, this process should feel similar; the only difference is that you also specify the associated cloud resource connection.
If no schema was provided and the service account was not granted access to the bucket in the previous step, this step will fail with an access denied message.
Navigate back to BigQuery > BigQuery Studio.
Click the three dots next to your project name and select Create dataset.
For the Dataset ID, use demo_dataset.
For Location type, choose Multi-region and select US (multiple regions in United States) from the dropdown.
Leave the rest of the fields as default and click Create Dataset.
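The same dataset could also be created from Cloud Shell; a minimal sketch:

# Create the demo_dataset dataset in the US multi-region
bq mk --dataset --location=US demo_dataset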
Now that you have a dataset, you can create a BigLake table from a file stored in Cloud Storage.

Click the three dots next to demo_dataset, then choose Create table.

Under Source, for Create table from, choose Google Cloud Storage.

Click Browse to select the source file. Navigate to the lab's Cloud Storage bucket, select the customer.csv file to import into BigQuery, and click Select.
Under Destination, verify your lab project has been selected and you're using the demo_dataset.
For the table name, use biglake_table.
Set the table type to External Table.
Select the box to Create a BigLake table using a Cloud Resource connection.
Verify that your connection ID us.my-connection is selected. Leave the remaining fields as default and click Create Table.
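Equivalently, a BigLake table can be created with SQL DDL instead of the console. A sketch, where BUCKET_NAME is an assumption standing in for the lab bucket that holds customer.csv:

# Alternative to the console steps above (do not run both; the table
# already exists if you used the console). Schema is auto-detected
# when omitted from the DDL.
bq query --use_legacy_sql=false "
CREATE EXTERNAL TABLE demo_dataset.biglake_table
WITH CONNECTION \`us.my-connection\`
OPTIONS (
  format = 'CSV',
  uris = ['gs://BUCKET_NAME/customer.csv']
)"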
Click Check my progress to verify the objective.
Now that you've created the BigLake table, you can use any BigQuery client to submit a query.
From the biglake_table preview toolbar, click Query > In new tab.
Run a query like the following against the BigLake table in the BigQuery Editor, replacing PROJECT_ID with your lab project ID:

SELECT * FROM `PROJECT_ID.demo_dataset.biglake_table`;

Click Run.

Verify you can see all of the columns and data in the resulting table.
Once a BigLake table has been created, it can be managed in a similar fashion to BigQuery tables. To create access control policies for BigLake tables, you'll first create a taxonomy of policy tags in BigQuery. Then, apply the policy tags to the sensitive rows or columns. In this section, you will create a column level policy. For directions on setting up row-level security, see the row-level security guide.
For the purposes of this lab, a BigQuery taxonomy and an associated policy tag have already been created for you. You will now use this policy tag to restrict access to certain columns within the BigQuery table. For this example, you will restrict access to sensitive information such as address, postal code, and phone number.
From the Navigation Menu, go to BigQuery > BigQuery Studio.
Navigate to demo_dataset > biglake_table and click the table to open the table schema page.
Click Edit Schema.
Check the boxes next to the address, postal_code, and phone fields.
Click Add policy tag.
Select the policy tag that was created for this lab, and then click Select.
Your columns should now have the policy tags attached to them.
Click Save.
Verify that the policy tags now appear on the address, postal_code, and phone columns in the table schema.
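Policy tags can also be attached without the console by editing the table's schema definition. A hypothetical sketch, where POLICY_TAG_NAME stands for the full resource name of your policy tag (a string of the form projects/PROJECT_ID/locations/us/taxonomies/TAXONOMY_ID/policyTags/TAG_ID, which you can copy from the console):

# Dump the current schema, then edit it to add an entry such as
#   "policyTags": {"names": ["POLICY_TAG_NAME"]}
# on the address, postal_code, and phone fields
bq show --schema --format=prettyjson demo_dataset.biglake_table > /tmp/schema.json

# Apply the edited schema back to the table
bq update demo_dataset.biglake_table /tmp/schema.json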
Open the query editor for the biglake_table.
Run a SELECT * query like the one from the previous section against the BigLake table:

SELECT * FROM `PROJECT_ID.demo_dataset.biglake_table`;

Click Run.

You should receive an Access Denied error, because your account is not authorized to read the columns protected by the policy tag.

Now run a query that excludes the protected columns, for example:

SELECT * EXCEPT (address, postal_code, phone)
FROM `PROJECT_ID.demo_dataset.biglake_table`;

The query should run without any issues and return the columns you have access to. This example shows that column-level security enforced through BigQuery also applies to BigLake tables.
You can upgrade existing tables to BigLake tables by associating the existing table with a cloud resource connection. For a complete list of flags and arguments, see bq update and bq mkdef.
Click the three dots next to demo_dataset, then choose Create table.
Under Source for Create table from, choose Google Cloud Storage.
Click Browse to select the source file. Navigate to the lab's Cloud Storage bucket, select the invoice.csv file to import into BigQuery, and click Select.
Under Destination, verify your lab project has been selected and you're using the demo_dataset.
For the table name, use external_table.
Set the table type to External Table. Leave the rest of the fields as default and click Create Table.
Click Check my progress to verify the objective.
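Next, upgrade external_table to a BigLake table by associating it with the cloud resource connection. The lab provides the exact commands for this step; the following is a minimal sketch of the general approach with bq mkdef and bq update, run from Cloud Shell (the gs:// path and bucket name are assumptions; substitute the bucket that holds invoice.csv):

export PROJECT_ID=$(gcloud config get-value project)

# Build an external table definition that references the cloud resource connection
bq mkdef \
    --autodetect \
    --connection_id=$PROJECT_ID.us.my-connection \
    --source_format=CSV \
    "gs://BUCKET_NAME/invoice.csv" > /tmp/tabledef.json

# Update the existing external table with the new definition,
# upgrading it to a BigLake table
bq update \
    --external_table_definition=/tmp/tabledef.json \
    demo_dataset.external_table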
Click Check my progress to verify the objective.
From the Navigation Menu, go to BigQuery > BigQuery Studio.
Navigate to demo_dataset and double-click external_table.
Open the Details tab.
Verify under External Data Configuration that the table is now using the proper Connection ID.
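You can also confirm this from Cloud Shell; a quick sketch:

# The externalDataConfiguration block should now include a connectionId field
bq show --format=prettyjson demo_dataset.external_table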
Great! You successfully upgraded the existing external table to a BigLake table by associating it to a cloud resource connection.
In this lab you created a connection resource, set up access to a Cloud Storage data lake, and created a BigLake table from it. You then queried the BigLake table through BigQuery and set up column-level access control policies. Lastly, you updated an existing external table to a BigLake table using the connection resource.
Be sure to check out the following documentation for more practice with BigLake:
Google Cloud training and certification helps you make the most of Google Cloud technologies. Our classes include technical skills and best practices to help you get up to speed quickly and continue your learning journey. We offer fundamental to advanced level training, with on-demand, live, and virtual options to suit your busy schedule. Certifications help you validate and prove your skill and expertise in Google Cloud technologies.
Manual Last Updated January 16, 2024
Lab Last Tested January 16, 2024
Copyright 2025 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.