

To use the decorator, add @task_group before a Python function that calls the functions of the tasks that should go in the task group. For example:

from airflow.decorators import dag, task, task_group
import pendulum

@dag(schedule=None, start_date=pendulum.datetime(2021, 1, 1, tz="UTC"), catchup=False)
def task_group_example():
    @task(task_id="extract", retries=2)
    def extract_data():
        data_string = '...'  # the original value is truncated in the source
        ...

    # The definitions of transform_values (the task group) and load
    # did not survive in the source.

    load(transform_values(extract_data()))

task_group_example = task_group_example()
Image 9 - Airflow DAG runtime in the Gantt view (image by author)

The bars representing the runtimes are placed on top of each other, indicating the tasks have indeed run in parallel.

That's all I wanted to cover today, so let's wrap things up next.

Today you've successfully written your first Airflow DAG that runs tasks in parallel. It's a huge milestone, especially because your DAGs can now finish faster. Most of the time, similar tasks that don't depend on each other don't need to run one after the other, so running them in parallel is a huge time saver.

In the following article, we'll take a deep dive into Airflow XComs, a method of sending data between tasks. Stay tuned for that, and I'll make sure to publish the article in a couple of days.

In the Airflow UI, blue highlighting is used to identify tasks and task groups. When you click and expand group1, blue circles identify the task group dependencies. The task immediately to the right of the first blue circle (t1) gets the group's upstream dependencies, and the task immediately to the left of the last blue circle (t2) gets the group's downstream dependencies. The task group dependencies are shown in the following animation:

When your task is within a task group, your callable task_id is the task_id prefixed with the group_id, in the form group_id.task_id. This ensures the task_id is unique across the DAG. It is important that you use this format when referring to specific tasks in XCom passing or in branching operator decisions.

Use the task group decorator

Another way of defining task groups in your DAGs is by using the task group decorator. The task group decorator is available in Airflow 2.1 and later. It functions like other Airflow decorators and allows you to define your task group with the TaskFlow API. Using task group decorators doesn't change the functionality of task groups, but it can make your code formatting more consistent if you're already using decorators in your DAGs.
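To make the task_id naming rule for tasks inside task groups concrete, here is a small plain-Python model of it. This is only a sketch of the rule, not Airflow's own code: the helper name qualified_task_id is hypothetical, the ids group1 and t1 are borrowed from the UI description, and the sketch assumes the default behavior in which group ids are prefixed and nested groups stack their prefixes.

```python
def qualified_task_id(task_id, group_ids=()):
    """Return the id under which a task inside task groups is addressed.

    Hypothetical helper that models the default prefixing rule.
    It does not import or call Airflow.
    """
    # Enclosing group ids come first (outermost to innermost), joined by dots.
    return ".".join([*group_ids, task_id])


print(qualified_task_id("t1", ["group1"]))           # task inside group1
print(qualified_task_id("t1", ["group1", "inner"]))  # hypothetical nested group
print(qualified_task_id("t0"))                       # task outside any group
```

Under these assumptions, a task that pulls an XCom from t1 or a branching callable that selects t1 would refer to it as group1.t1 and not as t1.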
You can see that the tasks are connected in a sequential manner - one after the other. It's a huge waste of time since the GET requests aren't connected in any way.

Running the DAG confirms the tasks are running sequentially:

Image 5 - Airflow DAG running tasks sequentially (image by author)

But probably the best confirmation is the Gantt view, which shows the time each task took:

Image 6 - Airflow DAG runtime in the Gantt view (image by author)

Let's go back to the code editor and modify the DAG so the tasks run in parallel. To start, we'll need to write another task that basically does nothing; it's there only so we can connect the other tasks to something. Let's write it above the current first task:

You can see how the Graph view has changed:

Image 7 - DAG view showing the tasks will run in parallel (image by author)

The start task will now run first, followed by the other four tasks that connect to the APIs and run in parallel. Trigger the DAG once again and inspect the Tree view - you'll see that the tasks have started running at the same time:

Image 8 - Inspecting the running DAG (image by author)

The best indicator is, once again, the Gantt view.
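The difference between the sequential and the parallel layout can also be seen without running Airflow. The sketch below is a plain-Python model, not Airflow code: it groups tasks into "waves", where every task in a wave has all of its upstream tasks finished and could therefore start at the same time. The function name execution_waves and the task names api_task_2 to api_task_4 are hypothetical; get_users and start come from the DAG described here. Whether Airflow actually runs a wave concurrently also depends on the executor and its configuration, which this model ignores.

```python
def execution_waves(upstream):
    """Group tasks into waves; all tasks in one wave can start together.

    `upstream` maps each task name to the list of tasks it waits for.
    """
    remaining = {task: set(deps) for task, deps in upstream.items()}
    done, waves = set(), []
    while remaining:
        # A task is ready once every task it waits for has finished.
        ready = sorted(t for t, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("cycle or unknown upstream task in the graph")
        waves.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return waves


# Sequential layout: each task waits for the previous one.
sequential = {
    "get_users": [],
    "api_task_2": ["get_users"],
    "api_task_3": ["api_task_2"],
    "api_task_4": ["api_task_3"],
}

# Parallel layout: a do-nothing start task fans out to the four API tasks.
parallel = {
    "start": [],
    "get_users": ["start"],
    "api_task_2": ["start"],
    "api_task_3": ["start"],
    "api_task_4": ["start"],
}

print(len(execution_waves(sequential)), "waves:", execution_waves(sequential))
print(len(execution_waves(parallel)), "waves:", execution_waves(parallel))
```

In this model the sequential layout needs four waves, one per task, while the parallel layout needs only two: the start task, then all four API tasks together. That matches the stacked bars the Gantt view shows for the parallel run.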

airflow tasks test parallel_dag get_users 2022-3-1

Image 2 - Testing an Airflow task through Terminal (image by author)

The task execution succeeded, and here's what it saved to the data folder:

Image 3 - Saved users in JSON format (image by author)

That's all we need for now, so let's test the DAG through the Airflow homepage next. Open up the Airflow webserver page and open our new DAG. Here's what it looks like in the Graph view:

Image 4 - Tasks of the Airflow DAG connected sequentially (image by author)
