
Make your first DAG

Prerequisite

Code

# import libraries

from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.bash import BashOperator

# define the default arguments applied to every task in the DAG
default_args = {
    'owner': 'ivy',
    'retries': 3,
    'retry_delay': timedelta(minutes=5),
    # 'depends_on_past': False,
    # 'email': ['airflow@example.com'],
    # 'email_on_failure': True,
    # 'email_on_retry': False,
    # note: 'catchup' is an argument of DAG(...), not a task default
}


# define the DAG
with DAG(
    dag_id='Simple_dag_illustration_v1',
    default_args=default_args,
    description='this is a simple DAG illustration',
    start_date=datetime(2024, 3, 1),
    schedule_interval='@daily',  # Airflow 2.4+ prefers the name `schedule`
    # catchup=False,
    # tags=['example', 'from_DAG'],
) as dag:

    # define your task(s)
    task1 = BashOperator(
        task_id='First_task_I_buy_grocery',
        bash_command="echo ---the first step I buy food---",
    )

    task2 = BashOperator(
        task_id='Second_task_I_cook',
        bash_command="echo ---the second step I cook---",
    )

    task3 = BashOperator(
        task_id='Third_task_I_eat',
        bash_command="echo ---the third step I eat---",
    )

    task4 = BashOperator(
        task_id='Final_task_I_clean',
        bash_command="echo ---the last step I wash dishes---",
    )

    # manage the logical order of your tasks
    task1 >> task2 >> task3 >> task4
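
To see why the tasks can be chained with `>>` in a single line, here is a minimal sketch that does not use Airflow at all. `ToyTask` and `execution_order` are hypothetical stand-ins written only for this illustration. The idea they model is that `a >> b` records `b` as downstream of `a` and then returns `b`, so the next `>>` attaches to `b`.

```python
class ToyTask:
    """Hypothetical stand-in for an operator; not Airflow's real class."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []

    def __rshift__(self, other):
        # Record `other` as a downstream task, then return it so that
        # `a >> b >> c` is evaluated as `(a >> b) >> c`, i.e. `b >> c`.
        self.downstream.append(other)
        return other


def execution_order(first):
    """Walk a purely linear chain and return the task ids in order."""
    order, current = [], first
    while current is not None:
        order.append(current.task_id)
        current = current.downstream[0] if current.downstream else None
    return order


t1 = ToyTask("First_task_I_buy_grocery")
t2 = ToyTask("Second_task_I_cook")
t3 = ToyTask("Third_task_I_eat")
t4 = ToyTask("Final_task_I_clean")

t1 >> t2 >> t3 >> t4

order = execution_order(t1)
print(order)
```

The toy only handles a straight line of tasks; a real scheduler also has to deal with branches and with several tasks feeding into one.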

Arguments commonly set in default_args, which apply to every task in the DAG unless a task overrides them, include:

  • owner: The owner of the DAG, usually the username or email address of the person responsible for maintaining the DAG.
  • depends_on_past: A boolean value indicating whether a task instance runs only if the same task's instance in the previous DAG run succeeded.
  • start_date: The date from which the DAG is scheduled, which determines the first task instance. Use a fixed datetime; a value computed relative to the current time (for example, datetime.now() minus a timedelta) is discouraged because it changes every time the file is parsed.
  • email: An email address, or a list of addresses, that receives notifications about the DAG's tasks.
  • email_on_failure: A boolean value indicating whether to send email notifications on task failures.
  • email_on_retry: A boolean value indicating whether to send email notifications on task retries.
  • retries: The number of retries to perform for failed tasks.
  • retry_delay: The delay between retries for failed tasks.
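  • catchup: A boolean value indicating whether to backfill runs for the schedule intervals missed since start_date. Note that this is an argument of the DAG itself (DAG(..., catchup=False)), not a task-level default, so it has no effect inside default_args.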
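
The effect of retries, retry_delay and catchup can be worked out with plain date arithmetic. The sketch below is a simplified model, not Airflow code: `retry_timeline` and `daily_logical_dates` are hypothetical helpers, each failed attempt is assumed to fail instantly, and a daily run is assumed to be created only once its one-day interval has fully elapsed.

```python
from datetime import datetime, timedelta


def retry_timeline(first_attempt, retries, retry_delay):
    """Start time of every attempt, assuming each one fails immediately."""
    # One initial attempt plus `retries` further attempts.
    return [first_attempt + i * retry_delay for i in range(retries + 1)]


def daily_logical_dates(start_date, now, catchup):
    """Daily intervals that have fully elapsed between start_date and now."""
    interval = timedelta(days=1)
    completed = (now - start_date) // interval  # whole days elapsed
    dates = [start_date + i * interval for i in range(completed)]
    # With catchup every missed interval gets a run;
    # without it only the most recent one does.
    return dates if catchup else dates[-1:]


# Values taken from the DAG above: retries=3, retry_delay=5 minutes.
attempts = retry_timeline(datetime(2024, 3, 2), 3, timedelta(minutes=5))
print(len(attempts), attempts[-1] - attempts[0])

# The DAG is first enabled at noon on 5 March 2024 (an assumed moment).
now = datetime(2024, 3, 5, 12, 0)
with_catchup = daily_logical_dates(datetime(2024, 3, 1), now, catchup=True)
without_catchup = daily_logical_dates(datetime(2024, 3, 1), now, catchup=False)
print([d.date().isoformat() for d in with_catchup])
print([d.date().isoformat() for d in without_catchup])
```

Under these assumptions a task is tried four times in total, with the last attempt starting 15 minutes after the first, and enabling the DAG on 5 March produces four backfilled runs with catchup but a single run without it.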