Python: scraping the Mini MP4 movie site (minimp4.com) and saving the movie titles locally, using Requests + lxml
movie.py
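import requests
from lxml import etree
from urllib.parse import urljoin

# Walk the listing pages page=0 .. page=9.
for page in range(10):
    url = "http://www.minimp4.com/movie/?page={}".format(page)
    r = requests.get(url, timeout=10)
    html = etree.HTML(r.text)
    # Collect the detail-page link of every movie on this listing page.
    hrefs = html.xpath('//div[@class="meta"]/h1/a/@href')
    for ur in hrefs:
        # urljoin leaves absolute links unchanged and resolves relative ones.
        rr = requests.get(urljoin(url, ur), timeout=10)
        hhtml = etree.HTML(rr.text)
        name = hhtml.xpath('//div[@class="movie-meta"]/h1/text()')
        if not name:
            # Skip detail pages where the title node is missing.
            continue
        print(name[0])
        # Append the title to movie.txt, one title per line.
        with open('movie.txt', 'a', encoding='utf-8') as fp:
            fp.write(name[0] + '\n')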
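
The script above appends to movie.txt every time it finds a title, so running it twice stores every title twice. One possible variation, shown here only as a sketch, is to collect the titles in a list first and write them in one go, skipping blanks and anything already in the file. The helper name `save_titles` and its parameters are hypothetical and not part of the original script; it uses only the standard library.

```python
from pathlib import Path


def save_titles(titles, path="movie.txt"):
    """Append titles to `path`, one per line, skipping blanks and duplicates.

    Hypothetical helper for illustration; returns the titles actually written.
    """
    target = Path(path)
    # Titles already in the file count as seen, so reruns do not repeat them.
    seen = set()
    if target.exists():
        seen = set(target.read_text(encoding="utf-8").splitlines())

    written = []
    for title in titles:
        title = title.strip()
        if title and title not in seen:
            seen.add(title)
            written.append(title)

    # Open the file once for the whole batch instead of once per title.
    if written:
        with target.open("a", encoding="utf-8") as fp:
            fp.write("\n".join(written) + "\n")
    return written
```

To use it with the scraper, one could append each `name[0]` to a list inside the loop and call `save_titles(names)` once after the loop ends.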