Here’s a guide to setting up Apache Airflow with Docker on a Linux machine, with shared DAG and plugin folders, extra plugins, specific Python packages on the Airflow workers, and a pinned Airflow version:

  1. Install Docker and Docker Compose

First, ensure you have Docker and Docker Compose installed on your Linux machine. If not, follow the official installation guides for your distribution before continuing.

  2. Create a project folder

Create a folder on your local machine for the Airflow project:

mkdir ~/airflow-project
cd ~/airflow-project
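The docker-compose file created later mounts ./dags and ./plugins from this folder into the containers, so it helps to create both subfolders up front:

```shell
# Create the shared folders that will be bind-mounted into every Airflow container
mkdir -p ~/airflow-project/dags ~/airflow-project/plugins
```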

  3. Create a Dockerfile

Create a Dockerfile in the ~/airflow-project folder with the following content:

FROM apache/airflow:1.10.3

# Install extra Python packages and plugins
RUN pip install package-name==package-version another-package-name==another-package-version

Replace package-name==package-version and another-package-name==another-package-version with the names and pinned versions of the Python packages (including pip-installable plugins) your DAGs need. Because every Airflow service is built from this image, the packages are available on the workers as well; plugins that ship as plain files go in the shared plugins folder instead.
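As a concrete illustration, a filled-in Dockerfile might look like the following. The two packages and their versions are arbitrary placeholders, not requirements of this setup:

```dockerfile
FROM apache/airflow:1.10.3

# Example only: pin whatever packages your DAGs actually import
RUN pip install pandas==0.24.2 requests==2.22.0
```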

  4. Create a docker-compose.yml file

Create a docker-compose.yml file in the ~/airflow-project folder. The CeleryExecutor needs a message broker as well as a metadata database, so the file below adds a Redis service alongside Postgres and points every Airflow container at both through Airflow's standard configuration environment variables:

version: "3"
services:
  postgres:
    image: "postgres:9.6"
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow

  redis:
    image: "redis:5.0"

  webserver:
    build: .
    restart: always
    command: webserver
    depends_on:
      - postgres
      - redis
    environment:
      - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
      - AIRFLOW__CORE__LOAD_EXAMPLES=False
      - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
      - AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0
      - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:airflow@postgres/airflow
    volumes:
      - ./dags:/opt/airflow/dags
      - ./plugins:/opt/airflow/plugins
    ports:
      - "8080:8080"

  scheduler:
    build: .
    restart: always
    command: scheduler
    depends_on:
      - postgres
      - redis
    environment:
      - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
      - AIRFLOW__CORE__LOAD_EXAMPLES=False
      - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
      - AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0
      - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:airflow@postgres/airflow
    volumes:
      - ./dags:/opt/airflow/dags
      - ./plugins:/opt/airflow/plugins

  worker:
    build: .
    restart: always
    command: worker
    depends_on:
      - postgres
      - redis
    environment:
      - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
      - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
      - AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0
      - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:airflow@postgres/airflow
    volumes:
      - ./dags:/opt/airflow/dags
      - ./plugins:/opt/airflow/plugins

  flower:
    build: .
    restart: always
    command: flower
    depends_on:
      - redis
    environment:
      - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
      - AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0
    ports:
      - "5555:5555"
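At this point the project folder should look roughly like this (the dags and plugins folders must exist for the bind mounts to work):

```text
airflow-project/
├── Dockerfile
├── docker-compose.yml
├── dags/
└── plugins/
```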

  5. Initialize the Airflow database

Run the following command in the ~/airflow-project folder to create Airflow's metadata tables in Postgres:

docker-compose run --rm webserver initdb

  6. Start the Airflow services

Run the following command in the ~/airflow-project folder to start all services in the background:

docker-compose up -d

Airflow should now be running with the specified version, the shared DAG and plugin folders, and the extra Python packages installed on every container built from the Dockerfile, workers included. You can access the Airflow web interface at http://localhost:8080.

You can update the DAGs and plugins in the shared folders on your local machine, and because those folders are bind-mounted into each container, the changes are picked up without rebuilding the images.
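To try the shared folder out, you can drop a minimal DAG file into ./dags. This is a sketch using the Airflow 1.10 import paths; the DAG id, schedule, and command are arbitrary examples:

```python
# ~/airflow-project/dags/example_shared_dag.py
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator  # 1.10-style import path

# Minimal DAG: a single task that echoes a message once a day
dag = DAG(
    dag_id="example_shared_dag",
    start_date=datetime(2019, 1, 1),
    schedule_interval="@daily",
    catchup=False,
)

hello = BashOperator(
    task_id="say_hello",
    bash_command="echo 'hello from the shared dags folder'",
    dag=dag,
)
```

Within a minute or so the scheduler should pick the file up and the DAG will appear in the web UI.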
