Introduction

AirFlow is an open-source workflow scheduler written in Python, and it ships with a rich web UI.

Installation

Python

aptitude install python
aptitude install python-dev
aptitude install python-pip
aptitude install libmysqlclient-dev

AirFlow

pip install airflow

Supervisor

aptitude install supervisor

Configuration

AirFlow

Initialize the database

airflow initdb

Adding user login

Install the required module

pip install "airflow[password]"

Add the configuration

vim airflow.cfg
## Add the following under [webserver]
authenticate = True
auth_backend = airflow.contrib.auth.backends.password_auth
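
As a quick sanity check, the two settings can be read back with Python's standard configparser module. This is only a minimal sketch and not part of AirFlow: the helper name auth_enabled is hypothetical, and it inspects the text of the file only, so it does not prove that the backend module is actually installed.

```python
import configparser

# The backend path used in the configuration step above.
EXPECTED_BACKEND = "airflow.contrib.auth.backends.password_auth"


def auth_enabled(cfg_text):
    """Return True if the [webserver] section enables password authentication."""
    # interpolation=None: other options in airflow.cfg may contain '%' characters.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_section("webserver"):
        return False
    authenticate = parser.getboolean("webserver", "authenticate", fallback=False)
    backend = parser.get("webserver", "auth_backend", fallback="")
    return authenticate and backend == EXPECTED_BACKEND


# Sample text standing in for the real file; in practice, read ~/airflow/airflow.cfg.
sample = """
[webserver]
authenticate = True
auth_backend = airflow.contrib.auth.backends.password_auth
"""
print(auth_enabled(sample))
```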

Change to the airflow directory and start a Python shell

cd ~/airflow
python

Run the following Python commands

import airflow
from airflow import models, settings
from airflow.contrib.auth.backends.password_auth import PasswordUser
user = PasswordUser(models.User())
user.username = 'user_name'
user.email = 'email@example.com'
user.password = 'password'
session = settings.Session()
session.add(user)
session.commit()
session.close()
exit()

Supervisord

Put the webserver and scheduler under Supervisor's process management

vim /etc/supervisor/conf.d/airflow.conf 

## Add the following
[program:airflow_webserver]
command=airflow webserver
user=ubuntu
stderr_logfile=/var/log/airflow/webserver.err.log
stdout_logfile=/var/log/airflow/webserver.out.log
[program:airflow_scheduler]
command=airflow scheduler
user=ubuntu
stderr_logfile=/var/log/airflow/scheduler.err.log
stdout_logfile=/var/log/airflow/scheduler.out.log
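
To my knowledge, Supervisor does not create missing parent directories for log files, so /var/log/airflow has to exist before the programs are started. The sketch below lists the directories that the *_logfile options point to and reports whether each one exists. The helper name log_dirs is hypothetical, and the embedded text is a shortened copy of the config above.

```python
import configparser
import os


def log_dirs(conf_text):
    """Collect the parent directory of every *_logfile option in the config."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    dirs = set()
    for section in parser.sections():
        for key, value in parser.items(section):
            if key.endswith("_logfile"):
                dirs.add(os.path.dirname(value))
    return sorted(dirs)


# Shortened copy of the config; in practice, read /etc/supervisor/conf.d/airflow.conf.
conf = """
[program:airflow_webserver]
command=airflow webserver
stderr_logfile=/var/log/airflow/webserver.err.log
stdout_logfile=/var/log/airflow/webserver.out.log
"""
for d in log_dirs(conf):
    print(d, "exists" if os.path.isdir(d) else "missing")
```

Any directory reported as missing can be created with mkdir -p and handed to the user named in the config (ubuntu in the example above).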

Troubleshooting

ImportError: No module named pidlockfile

## Solution

aptitude remove python-lockfile
pip install lockfile
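
To confirm that the fix took effect, one can retry the import that failed. This is a minimal sketch: the helper name can_import is hypothetical, and lockfile.pidlockfile is the module path implied by the error message above.

```python
import importlib


def can_import(name):
    """Return True if the named module can be imported in this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# Expected to print True once the pip-installed lockfile package is in place.
print(can_import("lockfile.pidlockfile"))
```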

ImportError: cannot import name MySqlOperator

## Solution

pip install "airflow[celery]"