Python Practical Tips: Task Splitting

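Today let's talk about task splitting in Python. Take a web crawler as an example: we read the contents of a txt file that stores URLs and get back a list of URLs. We call this URL list the big task.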

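List splitting
Setting memory usage aside, let's split the big task above into small tasks. For example, each small task visits at most 5 URLs, and we run at most one small task per second.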
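import os
import time

# Directory containing this script; url_list.txt is expected to sit next to it.
CURRENT_DIR = os.path.dirname(os.path.abspath(__file__))


def read_file():
    """Read url_list.txt and return its lines with surrounding whitespace stripped."""
    file_path = os.path.join(CURRENT_DIR, "url_list.txt")
    with open(file_path, "r", encoding="utf-8") as fs:
        result = [i.strip() for i in fs.readlines()]
    return result


def fetch(url):
    # Stand-in for the real request; here `url` is one batch (a list of URLs).
    print(url)


def run():
    max_count = 5  # at most 5 URLs per second
    url_list = read_file()
    for index in range(0, len(url_list), max_count):
        start = time.time()
        fetch(url_list[index:index + max_count])
        end = time.time() - start  # seconds spent on this batch
        if end < 1:
            # Wait out the rest of the second before starting the next batch.
            time.sleep(1 - end)


if __name__ == '__main__':
    run()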

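The key code is all in the for loop. First, we pass a third argument to range, which sets the iteration step to 5, so index advances by 5 each time: 0, 5, 10, and so on.
Then we slice url_list, taking five elements at a time; which five we get changes as index increases. If fewer than five elements remain at the end, slicing simply returns whatever is left, so there is no index-out-of-range error.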
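
To see the stepping and the slicing behavior in isolation, here is a minimal sketch without any file reading or sleeping. The helper name split_into_batches and the example.com URLs are made up for illustration and are not part of the script above.

```python
def split_into_batches(items, batch_size):
    """Return consecutive slices of items, each at most batch_size long."""
    # range() with a step yields the start index of every batch;
    # a slice that runs past the end of the list is silently truncated.
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


# 12 placeholder URLs (hypothetical), so the last batch is shorter than 5.
urls = [f"https://example.com/page/{n}" for n in range(12)]

starts = list(range(0, len(urls), 5))
batches = split_into_batches(urls, 5)

print(starts)                     # [0, 5, 10]
print([len(b) for b in batches])  # [5, 5, 2]
```

The last batch holds only two URLs, and nothing is raised even though the slice urls[10:15] asks for indexes beyond the end of the list.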
