
Before you begin
- Labs create a Google Cloud project and resources for a fixed time.
- Labs have a time limit and no pause feature. If you end the lab, you'll have to restart from the beginning.
- On the top left of your screen, click Start lab to begin.
In this lab, you complete the following tasks:
- Create a Bigtable instance
- Create a Kubernetes Engine cluster
- Create a ConfigMap
- Create OpenTSDB tables in Bigtable
- Deploy OpenTSDB
- Create OpenTSDB services
- Examine time-series data with OpenTSDB
In this lab you will learn how to collect, record, and monitor time-series data on Google Cloud using OpenTSDB running on Google Kubernetes Engine and Cloud Bigtable.
Time-series data is a highly valuable asset that you can use for several applications, including trending, monitoring, and machine learning. You can generate time-series data from server infrastructure, application code, and other sources. OpenTSDB can collect and retain large amounts of time-series data with a high degree of granularity.
In this hands-on lab you will create a scalable data collection layer using Kubernetes Engine and work with the collected data using Bigtable. The following diagram illustrates the high-level architecture of the solution:
Read these instructions. Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources are made available to you.
This hands-on lab lets you do the lab activities in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials you use to sign in and access Google Cloud for the duration of the lab.
To complete this lab, you need:
Click the Start Lab button. If you need to pay for the lab, a dialog opens for you to select your payment method. On the left is the Lab Details pane with the following:
Click Open Google Cloud console (or right-click and select Open Link in Incognito Window if you are running the Chrome browser).
The lab spins up resources, and then opens another tab that shows the Sign in page.
Tip: Arrange the tabs in separate windows, side-by-side.
If necessary, copy the Username below and paste it into the Sign in dialog.
You can also find the Username in the Lab Details pane.
Click Next.
Copy the Password below and paste it into the Welcome dialog.
You can also find the Password in the Lab Details pane.
Click Next.
Click through the subsequent pages:
After a few moments, the Google Cloud console opens in this tab.
Cloud Shell is a virtual machine that is loaded with development tools. It offers a persistent 5GB home directory and runs on Google Cloud. Cloud Shell provides command-line access to your Google Cloud resources.
Click Activate Cloud Shell at the top of the Google Cloud console.
Click through the following windows:
When you are connected, you are already authenticated, and the project is set to your Project_ID. gcloud is the command-line tool for Google Cloud. It comes pre-installed on Cloud Shell and supports tab-completion. For full documentation of gcloud in Google Cloud, refer to the gcloud CLI overview guide.
Enter the following commands in Cloud Shell to prepare your environment.
In your terminal, ensure your account is activated:
Paste the link in a new tab and follow the instructions using your student account. You will get a verification code to use for authentication.
Set the default Compute Engine zone:
You will be using Cloud Bigtable to store the time-series data that you collect. You must create a Bigtable instance to do that work.
Bigtable is a key/wide-column store that works especially well for time-series data, explained in Bigtable Schema Design for Time Series Data. Bigtable supports the HBase API, which makes it easy for you to use software designed to work with Apache HBase, such as OpenTSDB. You can learn about the HBase schema used by OpenTSDB in the OpenTSDB documentation.
A key component of OpenTSDB is the AsyncHBase client, which enables it to bulk-write to HBase in a fully asynchronous, non-blocking, thread-safe manner. When you use OpenTSDB with Bigtable, AsyncHBase is implemented as the AsyncBigtable client.
The ability to easily scale to meet your needs is a key feature of Bigtable. This lab uses a single-node development cluster because it is sufficient for the task and is economical. You should start your projects in a development cluster, moving to a larger production cluster when you are ready to work with production data. The Bigtable documentation includes detailed discussion about performance and scaling to help you pick a cluster size for your own work.
Now you will create your Bigtable instance.
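As a sketch, a single-node instance like the one described above can be created with the gcloud CLI. The instance ID, cluster ID, and zone below are placeholders; use the exact values the lab provides.

```shell
# Sketch only -- instance ID, cluster ID, and zone are placeholder values.
gcloud bigtable instances create opentsdb-instance \
    --display-name="OpenTSDB time-series storage" \
    --cluster-config=id=opentsdb-cluster,zone=us-central1-f,nodes=1
```

A single node is sufficient for this lab; you can add nodes to the cluster later without downtime as your workload grows.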
Click Check my progress to verify the objective.
Kubernetes Engine provides a managed Kubernetes environment. After you create a Kubernetes Engine cluster, you can deploy Kubernetes pods to it. This Qwiklab uses Kubernetes Engine and Kubernetes pods to run OpenTSDB.
OpenTSDB separates its storage from its application layer, which enables it to be deployed across multiple instances simultaneously. By running in parallel, it can handle a large amount of time-series data. Packaging OpenTSDB into a Docker container enables easy deployment at scale using Kubernetes Engine.
Adding the two extra scopes to your Kubernetes cluster allows your OpenTSDB container to interact with Bigtable. You can pull images from Google Container Registry without adding a scope for Cloud Storage, because the cluster can read from Cloud Storage by default. You might need additional scopes in other deployments.
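For illustration, a cluster with the two Bigtable scopes mentioned above might be created as follows. The cluster name and machine type are assumptions; the lab supplies the exact command.

```shell
# Sketch only -- cluster name and machine type are placeholders.
gcloud container clusters create opentsdb \
    --zone=us-central1-f \
    --machine-type=n1-standard-4 \
    --scopes="https://www.googleapis.com/auth/bigtable.admin","https://www.googleapis.com/auth/bigtable.data"
```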
Click Check my progress to verify the objective.
To deploy and demonstrate OpenTSDB with a Bigtable storage backend, this guide uses a series of Docker container images that are deployed to GKE. You build several of these images with Cloud Build, using code from an accompanying GitHub repository. Deployments to GKE pull these images from a container repository; in this guide, you use Artifact Registry to manage them.
Two Docker container images are used in this lab. The first image is used for two purposes: to perform the one-time Bigtable database setup for OpenTSDB, and to deploy the read and write service containers for the OpenTSDB deployment. The second image is used to generate sample metric data to demonstrate your OpenTSDB deployment.
When you submit the container image build job to Cloud Build, you tag the images so that they are stored in Artifact Registry after they are built. Because you tagged the images appropriately, when the build is complete, they are managed by your Artifact Registry repository.
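The tagging step can be sketched as follows. The Artifact Registry region, repository, and image names here are hypothetical; use the names given in the lab.

```shell
# Sketch only -- region, repository, and image names are placeholders.
export PROJECT_ID=$(gcloud config get-value project)
gcloud builds submit --tag \
    us-central1-docker.pkg.dev/${PROJECT_ID}/opentsdb-bt/opentsdb-server .
```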
Kubernetes uses a ConfigMap to decouple configuration details from the container image, which makes applications more portable. The configuration for OpenTSDB is specified in the opentsdb.conf file, and a ConfigMap containing that file is included with the sample code.

In this and following steps, you use the GNU envsubst utility to replace environment-variable placeholders in the YAML template files with the respective values for your deployment.
Create the opentsdb-config.yaml file from its template and use it to create the ConfigMap. If you change the opentsdb.conf ConfigMap later, reapply it to push the changes to the cluster; some changes also require you to restart processes.

Click Check my progress to verify the objective.
Before you can read or write data using OpenTSDB, you need to create the necessary tables in Bigtable to store that data. Follow these steps to create a Kubernetes job that creates the tables.
The job can take a minute or more to complete.
The output should indicate 1 SUCCEEDED under the heading Pods Statuses. Do not proceed until you see this status.
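One way to watch for that status; the job name here is an assumption based on the guide's manifests, so substitute the name your deployment uses.

```shell
# Inspect the job's pod statuses:
kubectl describe jobs
# Or block until the job completes (job name is a placeholder):
kubectl wait --for=condition=complete --timeout=300s job/opentsdb-init
```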
Click Check my progress to verify the objective.
The output is similar to the following:
The output lists each table that was created. This job runs several table creation commands, each using the format create TABLE_NAME. The tables are successfully created when you see output in the form 0 row(s) in TIME seconds.
The tables you just created will store data points from OpenTSDB. In a later step, you will configure a test service to write time-series data into these tables. Time-series data points are organized and stored as follows:
| Field | Required | Description | Example |
| --- | --- | --- | --- |
| metric | Required | Item that is being measured - the default key | sys.cpu.user |
| timestamp | Required | Epoch time of the measurement | 1497561091 |
| value | Required | Measurement value | 89.3 |
| tags | At least one tag is required | Qualifies the measurement for querying purposes | hostname=www45 |
The metric, timestamp, and tags (tag key and tag value) form the row key. The timestamp is normalized to one hour, to ensure that a row does not contain too many data points. For more information, see HBase Schema.
The rest of this Qwiklab provides instructions for making the sample scenario work. The following diagram shows the architecture you will use:
This Qwiklab uses two OpenTSDB Kubernetes deployments: one deployment sends metrics to Bigtable and the other deployment reads from it. Using two deployments prevents long-running reads and writes from blocking each other. The Pods in each deployment use the same container image. OpenTSDB provides a daemon called tsd that runs in each container.
A single tsd process can handle a high throughput of events per second. To distribute load, each deployment in this guide creates three replicas of the read and write Pods.
The configuration information for the write deployment is in the opentsdb-write.yaml.tpl file in the deployments folder of the guide repository.
The configuration information for the reader deployment is in the opentsdb-read.yaml.tpl file in the deployments folder of the guide repository.
Verify that the opentsdb-read and opentsdb-write pods all have a status of Running. In a production deployment, you can increase the number of tsd Pods that are running, either manually or by using autoscaling in Kubernetes. Similarly, you can increase the number of instances in your GKE cluster manually or by using the cluster autoscaler.
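For example, the replica counts and cluster size can be adjusted with standard commands like these. The deployment and cluster names, and the replica and node counts, are illustrative.

```shell
# Scale the write deployment manually (replica count is illustrative):
kubectl scale deployment opentsdb-write --replicas=5
# Or autoscale it on CPU utilization:
kubectl autoscale deployment opentsdb-write --min=3 --max=10 --cpu-percent=80
# Resize the GKE cluster (cluster name and node count are placeholders):
gcloud container clusters resize opentsdb --num-nodes=5 --zone=us-central1-f
```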
Click Check my progress to verify the objective.
In order to provide consistent network connectivity to the deployments, you will create two Kubernetes services: one service writes metrics into OpenTSDB and the other reads metrics from it.
The configuration information for the metrics writing service is contained in the opentsdb-write.yaml file in the services folder of the example repository; the reading service is defined similarly in opentsdb-read.yaml. Each service is created inside your Kubernetes cluster and is reachable by other services running in your cluster. In the next section of this lab, you write metrics to the writing service.
Verify that the opentsdb-write and opentsdb-read services are running. You should see both services listed in the output:
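You can confirm this with kubectl. Inside the cluster, other pods reach these services by DNS name on OpenTSDB's default port 4242; the default namespace is assumed here.

```shell
# List the two OpenTSDB services and their cluster IPs:
kubectl get services opentsdb-write opentsdb-read
# From another pod in the cluster, the services resolve as, e.g.:
#   http://opentsdb-write.default.svc.cluster.local:4242
```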
Click Check my progress to verify the objective.
There are several mechanisms to write data into OpenTSDB. After you define service endpoints, you can direct processes to begin writing data to them. This guide deploys a Python service that emits demonstrative time-series data for two metrics: Cluster Memory Utilization (memory_usage_gauge) and Cluster CPU Utilization (cpu_node_utilization_gauge).
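Besides the Python service, any client can write data points through OpenTSDB's HTTP /api/put endpoint. A sketch, run from a pod inside the cluster (the write service is not exposed externally); the metric name and tag are illustrative.

```shell
# Write a single data point to the cluster-internal write service.
curl -s -X POST "http://opentsdb-write:4242/api/put" \
  -H "Content-Type: application/json" \
  -d '{"metric":"sys.cpu.user","timestamp":'"$(date +%s)"',"value":42.5,"tags":{"host":"example"}}'
```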
You can query time-series metrics by using the opentsdb-read service endpoint that you deployed earlier. You can use the data in a variety of ways. One common option is to visualize it. OpenTSDB includes a basic interface to visualize metrics that it collects. This lab uses Grafana, a popular alternative for visualizing metrics that provides additional functionality.
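For example, the sample memory metric can be queried over HTTP with OpenTSDB's /api/query endpoint, run from inside the cluster or through a port-forward; the sum aggregator here is an illustrative choice.

```shell
# Read the last hour of the sample memory metric from the read service.
curl -s "http://opentsdb-read:4242/api/query?start=1h-ago&m=sum:memory_usage_gauge"
```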
Running Grafana in your cluster requires a process similar to the one you used to set up OpenTSDB. In addition to creating a ConfigMap and a deployment, you need to configure port forwarding so that you can access Grafana while it is running in your Kubernetes cluster.
Create the Grafana ConfigMap from the grafana.yaml file in the configmaps folder of the guide repository. You should then see grafana-config in the list of ConfigMaps:
Next, create the Grafana deployment from the grafana.yaml file in the deployments folder of the example repository, and wait until the AVAILABLE value for the grafana deployment reports as 1.

Click Check my progress to verify the objective.
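The port-forwarding step can be sketched as follows; the app=grafana label selector is an assumption based on typical manifests.

```shell
# Forward local port 3000 to the Grafana pod, then open Web Preview on port 3000.
GRAFANA_POD=$(kubectl get pods -l app=grafana -o jsonpath='{.items[0].metadata.name}')
kubectl port-forward "$GRAFANA_POD" 3000:3000
```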
A new browser tab opens and connects to the Grafana web interface. After a few moments, the browser displays graphs like this:
This deployment of Grafana has been customized for this lab. The files configmaps/grafana.yaml and deployments/grafana.yaml configure Grafana to connect to the opentsdb-read service and to skip authentication, so that you can view the sample metrics immediately.

A deployment of Grafana in a production environment would implement proper authentication mechanisms and use richer time-series graphs.
You have now successfully completed the Using OpenTSDB to Monitor Time-Series Data on Cloud Platform lab.
To continue your quest, check out these suggestions:
...helps you make the most of Google Cloud technologies. Our classes include technical skills and best practices to help you get up to speed quickly and continue your learning journey. We offer fundamental to advanced level training, with on-demand, live, and virtual options to suit your busy schedule. Certifications help you validate and prove your skill and expertise in Google Cloud technologies.
Manual Last Updated April 29, 2024
Lab Last Tested April 29, 2024
Copyright 2025 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.