Make your first DAG



# import libraries

from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.bash import BashOperator

# define the default arguments (applied to every task in the DAG)
default_args = {
    'owner': 'ivy',
    'retries': 3,
    'retry_delay': timedelta(minutes=5),
    #'depends_on_past': False,
    #'email': [''],
    #'email_on_failure': True,
    #'email_on_retry': False,
    # note: catchup is a DAG argument, not a task default; set it in DAG(...)
}

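# define the DAG
with DAG(
    dag_id='Simple_dag_illustration_v1',
    default_args=default_args,
    description='this is simple dag illustration',
    start_date=datetime(2024, 3, 1),
    # newer Airflow releases (2.4+) use the name `schedule` instead
    schedule_interval='@daily',
    # catchup=False,  # optional: skip backfilling runs between start_date and today
) as dag: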

    # define your task(s)
    task1 = BashOperator(
        task_id='First_task_I_buy_grocery',
        bash_command="echo ---the first step I buy food---",
    )
    task2 = BashOperator(
        task_id='Second_task_I_cook',
        bash_command="echo ---the second step I cook---",
    )
    task3 = BashOperator(
        task_id='Third_task_I_eat',
        bash_command="echo ---the third step I eat---",
    )
    task4 = BashOperator(
        task_id='Final_task_I_clean',
        bash_command="echo ---the last step I wash dishes---",
    )

    # manage the logical order of your tasks
    task1 >> task2 >> task3 >> task4
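
To see why a line like task1 >> task2 >> task3 >> task4 chains the tasks in order, here is a minimal sketch in plain Python. The Task class and the run_order helper are hypothetical stand-ins written for this illustration; they are not Airflow's actual implementation, they only mimic the idea that >> records the right-hand task as downstream and returns it so the chain keeps going.

```python
class Task:
    """Toy stand-in for an Airflow operator (hypothetical, illustration only)."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []  # tasks that should run after this one

    def __rshift__(self, other):
        # `a >> b` records b as downstream of a and returns b,
        # so `a >> b >> c` is evaluated left to right as `(a >> b) >> c`.
        self.downstream.append(other)
        return other


def run_order(first):
    """Follow the single downstream link from `first` and list task ids in order."""
    order = []
    current = first
    while current is not None:
        order.append(current.task_id)
        current = current.downstream[0] if current.downstream else None
    return order


task1 = Task("First_task_I_buy_grocery")
task2 = Task("Second_task_I_cook")
task3 = Task("Third_task_I_eat")
task4 = Task("Final_task_I_clean")

task1 >> task2 >> task3 >> task4

print(run_order(task1))
```

The sketch only handles a straight line of tasks; a real DAG can also fan out to several downstream tasks.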

Commonly used arguments in default_args include the following:

  • owner: The owner of the tasks in the DAG, usually the username or email address of the person responsible for maintaining them.
  • depends_on_past: A boolean value indicating whether a task instance should run only if the same task succeeded in the previous DAG run.
  • start_date: The date from which the DAG's first run is scheduled. Use a fixed datetime, as in the example above; a value computed relative to the current time, such as datetime.now(), is discouraged because it changes every time the file is parsed.
  • email: An email address, or a list of addresses, to receive notifications related to the DAG.
  • email_on_failure: A boolean value indicating whether to send email notifications on task failures.
  • email_on_retry: A boolean value indicating whether to send email notifications on task retries.
  • retries: The number of retries to perform for failed tasks.
  • retry_delay: The delay between retries for failed tasks.
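  • catchup: A boolean value indicating whether to backfill or catch up with the historical schedule for the DAG. Note that this is an argument of the DAG itself rather than a task-level default, so pass it to DAG(...) instead of putting it in default_args.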
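
As a rough illustration of how these defaults behave, the plain-Python sketch below merges DAG-wide defaults with a per-task override and works out the retry budget. The helper names effective_args, max_attempts, and total_retry_wait are made up for this example and are not part of Airflow; the sketch also assumes a constant retry_delay with no exponential backoff.

```python
from datetime import timedelta

# Same defaults as in the DAG above
default_args = {
    "owner": "ivy",
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
}


def effective_args(defaults, **task_kwargs):
    """Merge DAG-wide defaults with a task's own arguments; the task's values win."""
    return {**defaults, **task_kwargs}


def max_attempts(args):
    """One initial try plus the configured number of retries."""
    return 1 + args["retries"]


def total_retry_wait(args):
    """Total time spent waiting between attempts, assuming a constant delay."""
    return args["retries"] * args["retry_delay"]


plain = effective_args(default_args)                 # task uses the defaults
override = effective_args(default_args, retries=1)   # task sets its own retries

print(max_attempts(plain), total_retry_wait(plain))
print(max_attempts(override), total_retry_wait(override))
```

With the defaults from this tutorial, a failing task is tried up to four times with fifteen minutes of waiting in total, while the task that overrides retries to 1 is tried at most twice.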