In the dynamic world of data engineering and workflow orchestration, organizations are increasingly migrating from legacy enterprise schedulers like Control-M to the open-source powerhouse, Apache Airflow. This transition often involves a complex and time-consuming process of converting existing job definitions. DAGify accelerates and simplifies the migration of Control-M workflows to a cloud-native environment such as Cloud Composer, so you can focus on optimizing workflows and on building and orchestrating data pipelines in the new environment.
DAGify is an open-source solution that automates the conversion of Control-M XML files into Airflow's native DAG format. It also acts as a migration accelerator: DAGify significantly reduces the manual effort, and therefore the potential for errors, associated with the transition to Airflow.
In the flow below, DAGify converts XML files exported from a legacy enterprise scheduler into Python DAG files. This expedites the transition to Airflow, in this case Airflow on Cloud Composer.
Cloud Composer offers organizations a fully managed Airflow experience. It eliminates the complexities of managing Airflow infrastructure.
In this lab, you use DAGify to convert Control-M export files into Python native DAGs and run the migrated DAGs in a Cloud Composer 3 environment.
What you'll learn
In this lab, you learn how to perform the following tasks:
Create a Cloud Composer 3 environment
Download and configure DAGify
Use DAGify to convert Control-M export files into a Python native DAG
Deploy DAGs to the Cloud Composer environment
Make modifications to the DAGify configuration and templates
Setup and requirements
Before you click the Start Lab button
Read these instructions. Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources are made available to you.
This hands-on lab lets you do the lab activities in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials you use to sign in and access Google Cloud for the duration of the lab.
To complete this lab, you need:
Access to a standard internet browser (Chrome browser recommended).
Note: Use an Incognito (recommended) or private browser window to run this lab. This prevents conflicts between your personal account and the student account, which may cause extra charges incurred to your personal account.
Time to complete the lab—remember, once you start, you cannot pause a lab.
Note: Use only the student account for this lab. If you use a different Google Cloud account, you may incur charges to that account.
How to start your lab and sign in to the Google Cloud console
Click the Start Lab button. If you need to pay for the lab, a dialog opens for you to select your payment method.
On the left is the Lab Details pane with the following:
The Open Google Cloud console button
Time remaining
The temporary credentials that you must use for this lab
Other information, if needed, to step through this lab
Click Open Google Cloud console (or right-click and select Open Link in Incognito Window if you are running the Chrome browser).
The lab spins up resources, and then opens another tab that shows the Sign in page.
Tip: Arrange the tabs in separate windows, side-by-side.
Note: If you see the Choose an account dialog, click Use Another Account.
If necessary, copy the Username below and paste it into the Sign in dialog.
{{{user_0.username | "Username"}}}
You can also find the Username in the Lab Details pane.
Click Next.
Copy the Password below and paste it into the Welcome dialog.
{{{user_0.password | "Password"}}}
You can also find the Password in the Lab Details pane.
Click Next.
Important: You must use the credentials the lab provides you. Do not use your Google Cloud account credentials.
Note: Using your own Google Cloud account for this lab may incur extra charges.
Click through the subsequent pages:
Accept the terms and conditions.
Do not add recovery options or two-factor authentication (because this is a temporary account).
Do not sign up for free trials.
After a few moments, the Google Cloud console opens in this tab.
Note: To access Google Cloud products and services, click the Navigation menu or type the service or product name in the Search field.
Task 1. Create a Cloud Composer environment
In this task, you create a Cloud Composer 3 environment.
On the Google Cloud console title bar, in the Search field, type cloud composer, and then click Composer.
On the Environments page, for Create environment, select Composer 3.
For Name, type .
For Location, select .
Scroll to the bottom and click Show Advanced configuration.
For Environment bucket, select Custom bucket.
For Bucket name, click Browse, select , and then click Select.
Leave all other fields as default.
Click Create to create the environment.
NOTE: It can take up to 20 minutes to create the environment. You can continue with the lab and return to recheck this task before you deploy the DAG to Cloud Composer.
Click Check my progress to verify the objective.
Create a Cloud Composer Environment
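While the environment is being created, note that an equivalent Composer 3 environment can also be created from the command line. The following is only a rough sketch for reference; the environment name, location, and image version are illustrative placeholders, not the values used by this lab:
# Illustrative only: create a Composer 3 environment from the command line.
# Replace the name, location, and image version with the values for your project.
gcloud composer environments create example-composer-env \
    --location=us-central1 \
    --image-version=composer-3-airflow-2
In this lab, you create the environment through the console as described above.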
Task 2. Download and configure DAGify
In this task, you download and set up the DAGify tool.
In the console, on the Navigation menu, click Compute Engine > VM instances.
For the instance named lab-setup, in the Connect column, click SSH to open an SSH-in-browser terminal window.
In the terminal window, run the following command to clone the DAGify repository:
Navigate to the dagify folder and run the following commands:
cd ~/dagify
make dagify-clean
Wait a minute for the Python packages to install.
Once completed, activate the Python virtual environment by running the command:
source venv/bin/activate
Verify DAGify setup by running the following command:
python ./DAGify.py -h
Output:
Usage: DAGify.py [OPTIONS]
Run dagify.
Options:
-s, --source-path TEXT Path to source files for conversion [default:
(./source)]
-o, --output-path TEXT Path to output files after conversion. [default:
(./output)]
-c, --config-file TEXT Path to dagify configuration file. [default:
(./config.yaml)]
-t, --templates TEXT Path to dagify configuration file. [default:
(./dagify/templates)]
-d, --dag-divider TEXT Which field in Job Definition should be used to
divide up DAGS. [default: (PARENT_FOLDER)]
-r, --report Generate report in txt and json format which
gives an overview of job_types converted
-h, --help Show this message and exit.
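The -c and -t options become useful when you later modify the DAGify configuration and templates. As a reference, a sketch that passes the default paths shown in the help output explicitly would look like this:
# Point DAGify at its configuration file and templates directory explicitly.
# The paths below are the defaults reported by the help output above.
python ./DAGify.py -c ./config.yaml -t ./dagify/templates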
Click Check my progress to verify the objective.
Download and configure DAGify
Task 3. Run the DAGify command with sample data in default mode
In this task, you use DAGify in default mode to convert the sample Control-M export file in XML format into a Python native DAG.
In the lab-setup terminal, view the Control-M job XML file 001-tfatf.xml by running the following command:
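The exact command for this lab is filled in at lab start. As a rough sketch, assuming the sample export ships in the repository source folder and using an output path similar to the one referenced later in this task, the steps look something like this:
# Illustrative sketch: view the sample Control-M export (the actual path is provided by the lab).
cat ./source/001-tfatf.xml
# Run DAGify in default mode, using the -s and -o options documented in the help output.
python ./DAGify.py -s <folder-containing-001-tfatf.xml> -o ./output/lab-output-task-3
The converted DAG that DAGify writes to the output folder resembles the following: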
# Apache Airflow Base Imports
from airflow import DAG
from airflow.decorators import task
from airflow.sensors.external_task import ExternalTaskMarker
from airflow.sensors.external_task import ExternalTaskSensor
import datetime

# Apache Airflow Custom & DAG/Task Specific Imports
from airflow.operators.bash import BashOperator

default_args = {
    'owner': 'airflow',
}

with DAG(
    dag_id="fx_fld_001",
    start_date=datetime.datetime(2024, 1, 1),
    schedule_interval="@daily",  # TIMEFROM not found, default schedule set to @daily,
    catchup=False,
) as dag:

    # DAG Tasks
    fx_fld_001_app_001_subapp_001_job_001 = BashOperator(
        task_id="fx_fld_001_app_001_subapp_001_job_001",
        bash_command="echo I am task A",
        trigger_rule="all_success",
        dag=dag,
    )
...
Click Check my progress to verify the objective.
Run DAGify command with sample data in the default mode
Task 4. Deploy DAG to Cloud Composer
In this task, you check that the Cloud Composer environment is up and then deploy the converted DAG file into the environment.
In the console title bar, type cloud composer in the Search field, and then click Composer.
Validate that your Cloud Composer environment is up and running and ready for use.
Note: If the Cloud Composer environment is still being created, wait for the creation process to complete.
From the environment list, click the environment name to open the Environment details page.
Click Open DAGs folder to access the DAGs folder in the Cloud Storage bucket.
Here you can see the example default DAG airflow_monitoring.py that is created when a Composer environment is provisioned.
Return to the terminal window.
Upload the new Python DAG fx_fld_001.py to the Cloud Storage bucket associated with the Cloud Composer environment by using the gcloud storage cp command:
gcloud storage cp -r ~/dagify/output/lab-output-task-3/001-tfatf/* gs://{{{primary_project.startup_script.bucket_name|filled in at lab start}}}/dags/lab-output-task-3/
Verify that the Python DAG file has been uploaded to the correct storage bucket:
gcloud storage ls gs://{{{primary_project.startup_script.bucket_name | filled in at lab start}}}/dags/lab-output-task-3/*
Output:
gs://{{{primary_project.startup_script.bucket_name|filled in at lab start}}}/dags/lab-output-task-3/fx_fld_001.py
Go back to the Environment details page for your environment.
Click Open Airflow UI.
Authenticate using the student account if prompted.
Verify that fx_fld_001 DAG is visible in the DAGs list.
NOTE: It can take 2-3 minutes for uploaded DAGs to synchronize with Cloud Composer.
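While you wait for synchronization, you can optionally check from the command line that the scheduler has parsed the DAG by invoking the Airflow CLI through gcloud. ENVIRONMENT_NAME and LOCATION are placeholders for your lab's values:
# Optional: list the DAGs that the Airflow scheduler has parsed in your environment.
gcloud composer environments run ENVIRONMENT_NAME --location LOCATION dags list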
Explore Cloud Composer
In this task, you explore the detailed view of the fx_fld_001 DAG in Cloud Composer.
Click fx_fld_001 to open the detailed view of the DAG.
Click Code to view the converted Python source code.
Click Graph to view the graph of all tasks and dependencies.
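Optionally, you can trigger the migrated DAG manually instead of waiting for its @daily schedule, either from the Airflow UI or with a command along these lines (ENVIRONMENT_NAME and LOCATION are placeholders for your lab's values):
# Optional: trigger a manual run of the migrated DAG.
gcloud composer environments run ENVIRONMENT_NAME --location LOCATION dags trigger -- fx_fld_001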
Click Check my progress to verify the objective.
Deploy DAG to Cloud Composer
Task 5. Run the DAGify command with sample data using a DAG divider
In this task, you use DAGify with the DAG divider flag (-d) to divide the Control-M workflow into multiple DAG files based on the XML key APPLICATION.
Switch to the lab-setup terminal window.
View the Control-M job XML file 002-tftf.xml by running the following command:
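As in Task 3, the exact conversion command is filled in at lab start. A rough sketch of a divider run, using the documented -d flag and paths similar to the earlier task, looks like this:
# Illustrative sketch: view the sample export, then convert it while splitting DAGs on the APPLICATION key.
cat ./source/002-tftf.xml
python ./DAGify.py -s <folder-containing-002-tftf.xml> -o ./output/lab-output-task-5 -d APPLICATION
Because the export defines two applications, DAGify produces two DAG files; each resembles the following: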
# Apache Airflow Base Imports
from airflow import DAG
from airflow.decorators import task
from airflow.sensors.external_task import ExternalTaskMarker
from airflow.sensors.external_task import ExternalTaskSensor
import datetime

# Apache Airflow Custom & DAG/Task Specific Imports
from airflow.operators.bash import BashOperator

default_args = {
    'owner': 'airflow',
}

with DAG(
    dag_id="fx_fld_001_app_001",
    start_date=datetime.datetime(2024, 1, 1),
    schedule_interval="@daily",  # TIMEFROM not found, default schedule set to @daily,
    catchup=False,
) as dag:

    # DAG Tasks
    fx_fld_001_app_001_subapp_001_job_001 = BashOperator(
        task_id="fx_fld_001_app_001_subapp_001_job_001",
        bash_command="",
        trigger_rule="all_success",
        dag=dag,
    )
...
Deploy the converted Python DAG files by running the gcloud storage cp command to copy them to the Cloud Storage bucket associated with the Cloud Composer environment:
gcloud storage cp -r ~/dagify/output/lab-output-task-5/002-tftf/* gs://{{{primary_project.startup_script.bucket_name|filled in at lab start}}}/dags/lab-output-task-5/
Verify that the Python DAG files are uploaded to the correct storage bucket by running the following command:
gcloud storage ls gs://{{{primary_project.startup_script.bucket_name | filled in at lab start}}}/dags/lab-output-task-5/*
Output:
gs://{{{primary_project.startup_script.bucket_name|filled in at lab start}}}/dags/lab-output-task-5/fx_fld_001_app_001.py
gs://{{{primary_project.startup_script.bucket_name|filled in at lab start}}}/dags/lab-output-task-5/fx_fld_001_app_002.py
Go back to the Environment details page for your environment.
Click Open Airflow UI.
Verify that fx_fld_001_app_001 and fx_fld_001_app_002 DAGs are visible in the DAGs list.
NOTE: It can take 2-3 minutes for uploaded DAGs to synchronize with Cloud Composer.
Click Check my progress to verify the objective.
Run DAGify command with sample data using a DAG divider
Congratulations!
You’ve successfully used DAGify to convert Control-M export files into Python DAGs and deployed the migrated DAGs to the Cloud Composer environment.
Google Cloud training and certification helps you make the most of Google Cloud technologies. Our classes include technical skills and best practices to help you get up to speed quickly and continue your learning journey. We offer fundamental to advanced level training, with on-demand, live, and virtual options to suit your busy schedule. Certifications help you validate and prove your skill and expertise in Google Cloud technologies.
Manual Last Updated November 11, 2024
Lab Last Tested November 7, 2024
Copyright 2025 Google LLC. All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.