Here’s a guide to setting up Apache Airflow with Docker on a Linux machine, with shared DAGs and plugins folders, extra plugins, specific Python packages on the Airflow workers, and a pinned Airflow version:
- Install Docker and Docker Compose
First, ensure you have Docker and Docker Compose installed on your Linux machine. If not, follow the official installation guides:
- Docker: https://docs.docker.com/engine/install/
- Docker Compose: https://docs.docker.com/compose/install/
- Create a project folder
Create a folder on your local machine for the Airflow project, together with the `dags` and `plugins` subfolders that will be shared with the containers:

```bash
mkdir -p ~/airflow-project/dags ~/airflow-project/plugins
cd ~/airflow-project
```
- Create a Dockerfile
Create a `Dockerfile` in the `~/airflow-project` folder with the following content, pinning the Airflow version you want to run in the `FROM` line:

```dockerfile
FROM apache/airflow:1.10.12

# Install extra Python packages and plugins into the image so that the
# webserver, scheduler, and workers all have them available.
RUN pip install package-name==package-version another-package-name==another-package-version
```
Replace `package-name` and `package-version` with the names and versions of the desired plugins or Python packages, for example `pandas==0.25.3`. Because every Airflow service in the compose file below is built from this image, the packages end up on the workers as well.
- Create a docker-compose.yml file
Create a `docker-compose.yml` file in the `~/airflow-project` folder with the following content. It runs Postgres as the metadata database, Redis as the Celery message broker (required by the CeleryExecutor), and the Airflow webserver, scheduler, worker, and Flower services, all built from the Dockerfile above and sharing the same `dags` and `plugins` folders:
version: "3"
services:
postgres:
image: "postgres:9.6"
environment:
- POSTGRES_USER=airflow
- POSTGRES_PASSWORD=airflow
- POSTGRES_DB=airflow
webserver:
build: .
restart: always
depends_on:
- postgres
environment:
- LOAD_EX=n
- EXECUTOR=Celery
volumes:
- ./dags:/opt/airflow/dags
- ./plugins:/opt/airflow/plugins
ports:
- "8080:8080"
scheduler:
build: .
restart: always
depends_on:
- postgres
environment:
- LOAD_EX=n
- EXECUTOR=Celery
volumes:
- ./dags:/opt/airflow/dags
- ./plugins:/opt/airflow/plugins
worker:
build: .
restart: always
depends_on:
- postgres
environment:
- LOAD_EX=n
- EXECUTOR=Celery
volumes:
- ./dags:/opt/airflow/dags
- ./plugins:/opt/airflow/plugins
flower:
image: apache/airflow:1.10.3
restart: always
command: "flower"
depends_on:
- postgres
environment:
- EXECUTOR=Celery
ports:
- "5555:5555"
- Initialize the Airflow database
Run the following command in the `~/airflow-project` folder to create Airflow's metadata tables in Postgres:

```bash
docker-compose run --rm webserver initdb
```
- Start the Airflow services
Run the following command in the `~/airflow-project` folder:

```bash
docker-compose up -d
```
Airflow now runs with the pinned version, shared DAGs and plugins folders, and the extra Python packages installed on the workers. Check that the containers came up with `docker-compose ps`, then access the Airflow web interface at http://localhost:8080 (Flower, the Celery monitoring UI, is at http://localhost:5555).
You can update the DAGs and plugins in the shared folders on your local machine: DAG changes are picked up automatically by the scheduler, and plugin changes take effect after a `docker-compose restart`, with no image rebuild required.
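To verify the whole chain end to end, you can drop a small test DAG like the sketch below into the `dags` folder; the DAG id, task id, and the `requests` import are illustrative, so swap in one of the packages you actually pinned in the Dockerfile. Because the task runs on a Celery worker, a successful run confirms both the shared DAGs folder and the extra worker packages:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator


def check_extra_package():
    # Import a package installed via the Dockerfile's pip step. This function
    # executes on a Celery worker, so a green task proves the worker image
    # has the package.
    import requests  # swap for one of the packages you pinned
    print("requests version:", requests.__version__)


dag = DAG(
    dag_id="smoke_test",
    start_date=datetime(2020, 1, 1),
    schedule_interval=None,  # no schedule; trigger manually from the UI
    catchup=False,
)

check_package = PythonOperator(
    task_id="check_extra_package",
    python_callable=check_extra_package,
    dag=dag,
)
```

After the scheduler scans the folder, `smoke_test` appears in the web UI and can be triggered manually.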