Celery task raises a JSON error when returning a group result

Problem description:

My workflow is a bit complicated, but I hope someone can make sense of it from the explanation or the code below.

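Basically, I am scraping a company website/directory. When a query is submitted, it returns mini profiles of the matching companies, 50 companies per page. Using Celery, I am trying to use a group of tasks to fetch all the companies from all of the search result pages. The workflow is as follows: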

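  1. Get all companies from all 10 pages (50 companies per page): group(process_ali.s(url, query) for url in urls)(). In this case urls has 10 entries and each url yields 50 companies.
  2. This means I have an outer list that contains one list of results per page. Each result is a dictionary.
  3. group(company_worker.s(i) for i in res)(): the result of step 1 is processed as a group.
  4. Note that i is a list containing the results of one page. company_worker processes this list as a group as well, by calling another group.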
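
The data shapes in the steps above can be sketched in plain Python, without Celery. This is only an illustration under assumptions: fetch_page, process_page and process_company are hypothetical stand-ins for process_ali and company_worker, the page identifiers and company fields are invented, and a thread pool stands in for the task group.

```python
from concurrent.futures import ThreadPoolExecutor

PAGES = 10
COMPANIES_PER_PAGE = 50


def fetch_page(url, query):
    """Hypothetical stand-in for process_ali: returns one list of dicts per page."""
    return [{"page": url, "query": query, "company": n}
            for n in range(COMPANIES_PER_PAGE)]


def process_company(profile):
    """Hypothetical stand-in for the per-company work."""
    return "{page}#{company}".format(**profile)


def process_page(page_results):
    """Hypothetical stand-in for company_worker: handles the list for one page."""
    return [process_company(p) for p in page_results]


# Made-up page identifiers, not the real search result URLs.
urls = ["page-%d" % n for n in range(1, PAGES + 1)]

with ThreadPoolExecutor(max_workers=4) as pool:
    # Step 1: fan out over the pages; res is a list of lists of dicts.
    res = list(pool.map(lambda u: fetch_page(u, "bag"), urls))
    # Step 3: fan out again, one call per page result.
    processed = list(pool.map(process_page, res))

print(len(res), len(res[0]), len(processed))
```

Everything handed from one step to the next here is plain data (lists, dicts, strings), which is the shape the Celery version would also need to pass between tasks.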


This is how I call the task from the Python interpreter.

>>> from b2b.tasks import * 
>>> from pprint import pprint 
>>> from celery import shared_task, group, task, chain, chord 
>>> from celery.task.sets import subtask 
>>> base_url = "http://ebay.com" 
>>> query = "bag" 
>>> res = process_site.s(base_url, query)() 

http://www.ebay.com/company/bag/-50/1.html 
http://www.ebay.com/company/bag/-50/2.html 
http://www.ebay.com/company/bag/-50/3.html 
http://www.ebay.com/company/bag/-50/4.html 
http://www.ebay.com/company/bag/-50/5.html 
... 

This is the traceback I get immediately after the list of URLs above is printed:

Traceback (most recent call last): 
    File "<console>", line 1, in <module> 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/celery/canvas.py", line 172, in __call__ 
    return self.type(*args, **kwargs) 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/celery/app/task.py", line 420, in __call__ 
    return self.run(*args, **kwargs) 
    File "/Users/Me/projects/django_stuff/scraper/b2b/tasks.py", line 224, in process_site 
    all = group(company_worker.s(i) for i in res)() 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/celery/canvas.py", line 525, in __call__ 
    return self.apply_async(partial_args, **options) 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/celery/canvas.py", line 504, in apply_async 
    add_to_parent=add_to_parent) 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/celery/app/task.py", line 420, in __call__ 
    return self.run(*args, **kwargs) 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/celery/app/builtins.py", line 172, in run 
    add_to_parent=False) for stask in taskit] 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/celery/canvas.py", line 251, in apply_async 
    return _apply(args, kwargs, **options) 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/celery/app/task.py", line 559, in apply_async 
    **dict(self._get_exec_options(), **options) 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/celery/app/base.py", line 353, in send_task 
    reply_to=reply_to or self.oid, **options 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/celery/app/amqp.py", line 305, in publish_task 
    **kwargs 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/kombu/messaging.py", line 165, in publish 
    compression, headers) 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/kombu/messaging.py", line 241, in _prepare 
    body) = dumps(body, serializer=serializer) 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/kombu/serialization.py", line 164, in dumps 
    payload = encoder(data) 
    File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 35, in __exit__ 
    self.gen.throw(type, value, traceback) 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/kombu/serialization.py", line 59, in _reraise_errors 
    reraise(wrapper, wrapper(exc), sys.exc_info()[2]) 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/kombu/serialization.py", line 55, in _reraise_errors 
    yield 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/kombu/serialization.py", line 164, in dumps 
    payload = encoder(data) 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/anyjson/__init__.py", line 141, in dumps 
    return implementation.dumps(value) 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/anyjson/__init__.py", line 87, in dumps 
    return self._encode(data) 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/simplejson/__init__.py", line 380, in dumps 
    return _default_encoder.encode(obj) 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/simplejson/encoder.py", line 275, in encode 
    chunks = self.iterencode(o, _one_shot=True) 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/simplejson/encoder.py", line 357, in iterencode 
    return _iterencode(o, 0) 
    File "/Users/Me/.virtualenvs/djangoscrape/lib/python2.7/site-packages/simplejson/encoder.py", line 252, in default 
    raise TypeError(repr(o) + " is not JSON serializable") 
EncodeError: <AsyncResult: 7838a203-a853-4755-992b-cfd67207d398> is not JSON serializable 
>>> 

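Arguments sent to Celery tasks must be JSON serializable (for example strings, lists, and dicts), so most likely one of the arguments to one of the tasks is not. The last line of the traceback names an AsyncResult, and the failing call is group(company_worker.s(i) for i in res)(), which suggests that res holds result handles rather than the page data itself.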
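
As a rough illustration of that rule, the sketch below uses only the standard json module. FakeAsyncResult is a hypothetical stand-in for a result handle, not Celery's own class, and the id and page data are made up. The point is only that an arbitrary object fails to encode, while its id string or plain data encodes fine.

```python
import json


class FakeAsyncResult(object):
    """Hypothetical stand-in for a task result handle (not Celery's class)."""

    def __init__(self, task_id):
        self.id = task_id

    def __repr__(self):
        return "<FakeAsyncResult: %s>" % self.id


def is_json_serializable(value):
    """Return True if json.dumps accepts the value, False otherwise."""
    try:
        json.dumps(value)
    except (TypeError, ValueError):
        return False
    return True


handle = FakeAsyncResult("7838a203")  # made-up id
page = [{"name": "Example Co"}]       # made-up page result: a list of dicts

print(is_json_serializable(handle))     # object handle: rejected by json
print(is_json_serializable(handle.id))  # plain string: accepted
print(is_json_serializable(page))       # list of dicts: accepted
```

A check like this could be run on each argument before it is passed to .s(...) to find out which one is the offender.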