Celery Distributed Task Queue Quick Start
I. Introduction and Basic Usage of Celery
Celery is a distributed asynchronous message task queue written in Python. It makes asynchronous task processing straightforward, so whenever your business logic needs background work, Celery is worth considering. A couple of typical scenarios:
1) You want to run a batch command on 100 machines. It may take a long time, but you don't want your program to block waiting for the result. Instead you immediately get back a task ID, and some time later you use that ID to fetch the result; while the task is running, you are free to do other things.
2) You want a scheduled job, e.g. check all of your customers' records every day and send a birthday SMS to anyone whose birthday is today.
Advantages of Celery:
Simple: once you are familiar with Celery's workflow, configuration and usage are fairly straightforward.
Highly available: if a task fails or the connection drops mid-execution, Celery automatically retries the task.
Fast: a single Celery process can handle millions of tasks per minute.
Flexible: almost every Celery component can be extended or customized.
1. Installing Celery
$ pip install celery
Celery's default broker is RabbitMQ, which needs only a single line of configuration:
broker_url = 'amqp://guest:[email protected]:5672//'
2. Using Redis as the broker
$ pip install celery
$ pip install redis
Configuration:
app.conf.broker_url = 'redis://localhost:6379/0'  # without a password
app.conf.broker_url = 'redis://:[email protected]:port/db_number'  # with a password
3. Using Celery
3.1 Create a Celery application that defines your task list; name the file tasks.py:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
from celery import Celery
import subprocess

app = Celery('tasks',
             broker='redis://192.168.16.191:6380/0',
             backend='redis://192.168.16.191:6380/0')
# The backend is where task execution results are stored.

@app.task
def add(x, y):
    print("running...", x, y)
    return x + y

@app.task
def run_cmd():
    # Run a shell command and return its decoded output.
    return subprocess.Popen('df -h', shell=True,
                            stdout=subprocess.PIPE).stdout.read().decode('utf-8')
3.2 Start a Celery worker to listen for and execute tasks:
$ celery -A tasks worker --loglevel=info  # log levels include debug, info, warning, etc.
3.3 Submitting and calling tasks
Open another terminal, start a Python shell, and call the tasks:
>>> import tasks
>>> result = tasks.add.delay(5, 118)
>>> result.get()
123
>>>
>>> r2 = tasks.run_cmd.delay()
>>> r2.get()
# Other useful methods:
result.ready()               # has the result arrived yet?
result.get(timeout=1)        # fetch the result, with a timeout
result.get(propagate=False)  # do not re-raise an exception raised inside the task
result.traceback             # traceback showing where the task failed
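The semantics of these AsyncResult methods closely mirror the standard library's concurrent.futures.Future, which you can use to get a feel for them without a broker. This is a stdlib analogy for illustration only, not Celery code:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def add(x, y):
    time.sleep(0.2)  # pretend this is slow work running on a worker
    return x + y

with ThreadPoolExecutor() as pool:
    future = pool.submit(add, 5, 118)  # like tasks.add.delay(5, 118)
    print(future.done())               # like result.ready(); False while still running
    print(future.result(timeout=1))    # like result.get(timeout=1) -> 123
```

Just as with Celery, submitting returns immediately with a handle, and the caller decides when (and how long) to block waiting for the value.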
At this point a key for the task appears in Redis, and you can GET that key to inspect the stored result.
For details see the official documentation: http://docs.celeryproject.org/en/latest/index.html
Note: on Python 3.7 you may hit the following error when starting the worker:
File "/Users/li/.venv/venv-myprojet/lib/python3.7/site-packages/celery/backends/redis.py", line 22
    from . import async, base
                      ^
SyntaxError: invalid syntax
This is because async became a reserved keyword in Python 3.7, so it can no longer be used as a module name. Fix it by upgrading Celery:
pip3 install --upgrade https://github.com/celery/celery/tarball/master
Then start the worker again.
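The error above comes from the Python parser itself, not from Celery's logic: async and await were promoted to reserved keywords in Python 3.7, so any `from . import async` fails before the module even runs. You can confirm this with the standard library:

```python
import keyword

# On Python 3.7+ both names are hard keywords, so using them as
# identifiers (including module names) is a SyntaxError.
print(keyword.iskeyword("async"))  # True
print(keyword.iskeyword("await"))  # True
```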
II. Using Celery in a Project
1. Project layout
proj/__init__.py
    /celery.py
    /tasks.py
2. proj/celery.py
#!/usr/bin/env python
# -*- coding:utf-8 -*-
from __future__ import absolute_import, unicode_literals
from celery import Celery

broker = 'redis://192.168.16.191:6380/1'
backend = 'redis://192.168.16.191:6380/1'

app = Celery('proj',
             broker=broker,
             backend=backend,
             include=['proj.tasks'])

# Optional configuration, see the application user guide.
app.conf.update(
    result_expires=3600,
)

if __name__ == '__main__':
    app.start()
3. proj/tasks.py
#!/usr/bin/env python
# -*- coding:utf-8 -*-
from __future__ import absolute_import, unicode_literals
from .celery import app

@app.task
def add(x, y):
    return x + y

@app.task
def mul(x, y):
    return x * y

@app.task
def xsum(numbers):
    return sum(numbers)
4. Start the worker
$ celery -A proj worker -l info
5. Calling the tasks
>>> from proj import tasks
>>> r = tasks.add.delay(3, 4)
>>> r.get()
7
6. Extras
Running several Celery workers in the background (xxx1/xxx2/xxx3 are node names you choose):
# start
celery multi start xxx1 -A proj -l info
celery multi start xxx2 -A proj -l info
celery multi start xxx3 -A proj -l info
# restart
celery multi restart xxx1 -A proj -l info
celery multi restart xxx2 -A proj -l info
celery multi restart xxx3 -A proj -l info
# stop immediately
celery multi stop xxx1 -A proj
celery multi stop xxx2 -A proj
celery multi stop xxx3 -A proj
# stop, but wait for currently executing tasks to complete first
celery multi stopwait xxx1 -A proj
celery multi stopwait xxx2 -A proj
celery multi stopwait xxx3 -A proj
III. Celery Periodic Tasks
Celery supports periodic tasks: set the schedule and Celery runs the tasks for you automatically. The scheduling module is called celery beat.
1. Create a file named periodic_task.py (any name works):
#!/usr/bin/env python
# -*- coding:utf-8 -*-
from celery import Celery
from celery.schedules import crontab

broker = 'redis://192.168.16.191:6380/1'
backend = 'redis://192.168.16.191:6380/1'

app = Celery('tasks',
             broker=broker,
             backend=backend)

@app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
    # Run test('hello') every 10 seconds.
    sender.add_periodic_task(10.0, test.s('hello'), name='add every 10')
    # Run test('world') every 30 seconds.
    sender.add_periodic_task(30.0, test.s('world'), expires=10)
    # Run test('Happy Mondays!') every Monday at 7:30 a.m.
    sender.add_periodic_task(
        crontab(hour=7, minute=30, day_of_week=1),
        test.s('Happy Mondays!'),
    )

@app.task
def test(arg):
    print(arg)
1.1 The example above registers periodic tasks by calling a function; you can also declare them in configuration-file style. The following runs tasks.add every 30 seconds:
app.conf.beat_schedule = {
    'add-every-30-seconds': {
        'task': 'tasks.add',
        'schedule': 30.0,
        'args': (16, 16)
    },
}
app.conf.timezone = 'UTC'
Once the tasks are registered, Celery needs a separate process to dispatch them on schedule. Note: this process only dispatches tasks, it does not execute them. It continuously checks your schedule, and whenever a task is due it sends a task message for a Celery worker to execute.
2. Start the scheduler, celery beat:
$ celery -A periodic_task beat -l info
3. Start a Celery worker to execute the tasks:
$ celery -A periodic_task worker -l info
Note: both beat and the worker must be running.
4. More complex schedules
from celery.schedules import crontab

app.conf.beat_schedule = {
    # Executes every Monday morning at 7:30 a.m.
    'add-every-monday-morning': {
        'task': 'tasks.add',
        'schedule': crontab(hour=7, minute=30, day_of_week=1),
        'args': (16, 16),
    },
}
More crontab examples:
Example | Meaning
---|---
crontab() | Execute every minute.
crontab(minute=0, hour=0) | Execute daily at midnight.
crontab(minute=0, hour='*/3') | Execute every three hours: midnight, 3am, 6am, 9am, noon, 3pm, 6pm, 9pm.
crontab(minute=0, hour='0,3,6,9,12,15,18,21') | Same as previous.
crontab(minute='*/15') | Execute every 15 minutes.
crontab(day_of_week='sunday') | Execute every minute (!) on Sundays.
crontab(minute='*', hour='*', day_of_week='sun') | Same as previous.
crontab(minute='*/10', hour='3,17,22', day_of_week='thu,fri') | Execute every ten minutes, but only between 3-4 am, 5-6 pm, and 10-11 pm on Thursdays or Fridays.
crontab(minute=0, hour='*/2,*/3') | Execute every even hour, and every hour divisible by three. This means: every hour except 1am, 5am, 7am, 11am, 1pm, 5pm, 7pm, 11pm.
crontab(minute=0, hour='*/5') | Execute every hour divisible by 5. This means it is triggered at 3pm, not 5pm (since 3pm equals the 24-hour clock value of "15", which is divisible by 5).
crontab(minute=0, hour='*/3,8-17') | Execute every hour divisible by 3, and every hour during office hours (8am-5pm).
crontab(0, 0, day_of_month='2') | Execute on the second day of every month.
crontab(0, 0, day_of_month='2-30/3') | Execute on every even-numbered day.
crontab(0, 0, day_of_month='1-7,15-21') | Execute during the first and third weeks of the month.
crontab(0, 0, day_of_month='11', month_of_year='5') | Execute on the eleventh of May every year.
crontab(0, 0, month_of_year='*/3') | Execute in the first month of every quarter.
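To make the */n and range/step notation in the table concrete, here is a minimal pure-Python sketch (not Celery code; expand_cronspec is a hypothetical helper written only for illustration) that expands a crontab-style field into the set of values it matches:

```python
def expand_cronspec(spec, max_value):
    """Expand a crontab-style field like '*/3', '1-7,15-21' or '0,3,6'
    into the sorted list of matching integers in [0, max_value]."""
    matched = set()
    for part in spec.split(','):
        step = 1
        if '/' in part:                      # step suffix, e.g. '*/3'
            part, step_str = part.split('/')
            step = int(step_str)
        if part == '*':                      # full range
            lo, hi = 0, max_value
        elif '-' in part:                    # explicit range, e.g. '1-7'
            lo_str, hi_str = part.split('-')
            lo, hi = int(lo_str), int(hi_str)
        else:                                # single value
            lo = hi = int(part)
        matched.update(range(lo, hi + 1, step))
    return sorted(matched)

# hour='*/3' expands to midnight, 3am, 6am, ... as the table says.
print(expand_cronspec('*/3', 23))   # [0, 3, 6, 9, 12, 15, 18, 21]
# The list form '0,3,6,9,12,15,18,21' matches the same hours.
print(expand_cronspec('0,3,6,9,12,15,18,21', 23) == expand_cronspec('*/3', 23))  # True
```

This is why the table can call hour='0,3,6,9,12,15,18,21' "same as previous": both specs expand to the same set of hours.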
See the periodic tasks guide: http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html#solar-schedules
IV. Using Celery with Django
1. Project layout
- proj/
  - manage.py
  - proj/
    - __init__.py
    - settings.py
    - urls.py
2. proj/proj/celery.py
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery

# Set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings')

app = Celery('proj')

# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
#   should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')

# Load task modules from all registered Django app configs.
app.autodiscover_tasks()

@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
3. Import the app in proj/proj/__init__.py so that it is always loaded when the Django project starts.
proj/proj/__init__.py:
from __future__ import absolute_import, unicode_literals

# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celery_app

__all__ = ('celery_app',)
4. With app.autodiscover_tasks(), Celery follows the tasks.py convention and automatically discovers tasks in all installed apps:
- app1/
  - tasks.py
  - models.py
- app2/
  - tasks.py
  - models.py
5. Using the @shared_task decorator
The tasks you write may live in reusable apps, and a reusable app cannot depend on the project itself, so it cannot import the project's app instance directly.
@shared_task lets you define tasks without any concrete app instance.
demoapp/tasks.py:
# Create your tasks here
from __future__ import absolute_import, unicode_literals
from celery import shared_task

@shared_task
def add(x, y):
    return x + y

@shared_task
def mul(x, y):
    return x * y

@shared_task
def xsum(numbers):
    return sum(numbers)
6. Calling a Celery task from Django views
from django.shortcuts import render, HttpResponse
# Create your views here.
from app01 import tasks

def task_test(request):
    res = tasks.add.delay(228, 24)
    print("start running task")
    print("async task res", res.get())
    return HttpResponse('res %s' % res.get())
7. Configure the Celery broker and result backend in settings.py
CELERY_BROKER_URL = "redis://192.168.16.191:6380/1"
CELERY_RESULT_BACKEND = "redis://192.168.16.191:6380/1"
8. Extras
To combine this with Ajax in a real project, first request one URL to obtain a task_id, then have the frontend poll another URL with that task_id to fetch the result, which gives the asynchronous effect:
from django.shortcuts import render, HttpResponse
from celery.result import AsyncResult
from crm import tasks

def task_test(request):
    # Return the task_id; map this view to a URL.
    res = tasks.add.delay(228, 24)
    print("start running task")
    return HttpResponse(res.task_id)
    # print("async task res", res.get())
    # return HttpResponse('res %s' % res.get())

def task_res(request):
    # Look up the result using the task_id sent by the frontend; map this view to a URL.
    result = AsyncResult(id=request.GET.get('task_id'))
    if result.status == 'SUCCESS':
        return HttpResponse(result.get())
For details see: http://docs.celeryproject.org/en/latest/django/first-steps-with-django.html#using-celery-with-django
V. Periodic Tasks in Django
1. Install the django-celery-beat module:
$ pip install django-celery-beat
2. Add django_celery_beat to INSTALLED_APPS in settings.py:
INSTALLED_APPS = (
    ...,
    'django_celery_beat',
)
3. Create the database tables:
$ python manage.py migrate
4. Start celery beat with the Django database scheduler:
$ celery -A proj beat -l info -S django
5. Open django-admin and configure the schedule there.
Once beat and the worker are both running, every 2 minutes beat sends a task message telling the worker to execute the scp_task task.