Web scraping practice, day 1: scraping anime images with requests + bs4

import requests

from bs4 import BeautifulSoup

url = ""

response = requests.get(url)

print(response)  # <Response [200]> means the request succeeded

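If the response is <Response [418]> instead, the request triggered the site's anti-scraping check. Add a browser-like User-Agent header: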

headers={'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36'}

response = requests.get(url, headers=headers)

# Specify the encoding used to decode the response body

response.encoding = 'utf-8'
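The request steps so far can be gathered into one helper. This is only a sketch: the name fetch_html and the 10-second timeout are assumptions made for the example, not part of the original notes. It uses raise_for_status() so that a 418 (or any other 4xx/5xx answer) stops the script immediately instead of being parsed as if it were a real page.

```python
import requests

# Same browser-like User-Agent as above; many sites reject the default one.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/79.0.3945.88 Safari/537.36"
}


def fetch_html(url, timeout=10):
    """Return the decoded page source, or raise if the request failed.

    fetch_html and the timeout value are hypothetical choices for this sketch.
    """
    response = requests.get(url, headers=HEADERS, timeout=timeout)
    # Raises requests.HTTPError for 4xx/5xx status codes such as 418.
    response.raise_for_status()
    # Decode the body as UTF-8, as in the notes above.
    response.encoding = "utf-8"
    return response.text
```

Calling fetch_html(url) then returns the text that is handed to BeautifulSoup in the next step.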

# Hand the page source to BeautifulSoup

main_page = BeautifulSoup(response.text, "html.parser")  # "html.parser" selects Python's built-in HTML parser

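BeautifulSoup search methods:

find(tag, attrs={"attribute": "value"}) returns the first match

find_all(tag, attrs={"attribute": "value"}) returns every match

# By analogy: find("Zhang San", attrs={"height": "180"})

main_page.find("div", attrs={"class": "b"})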
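To make the difference between find and find_all concrete, here is a small sketch that parses an inline HTML string instead of a real page. The markup, class names and image paths are made up for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this example only.
html = """
<div class="a"><p>first</p></div>
<div class="b"><img src="/img/1.jpg" alt="one"></div>
<div class="b"><img src="/img/2.jpg" alt="two"></div>
"""

page = BeautifulSoup(html, "html.parser")

# find returns the first matching tag, or None when nothing matches.
first_b = page.find("div", attrs={"class": "b"})
missing = page.find("div", attrs={"class": "no-such-class"})

# find_all returns a list-like collection of every matching tag.
all_b = page.find_all("div", attrs={"class": "b"})

# tag.img is shorthand for the first <img> inside that tag.
srcs = [div.img.get("src") for div in all_b]

print(first_b.img.get("src"))  # /img/1.jpg
print(srcs)                    # ['/img/1.jpg', '/img/2.jpg']
print(missing)                 # None
```

Because find can return None, it is worth checking the result before calling methods on it.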

Running

with open("%s.jpg" % title, mode='wb') as f:
    f.write(requests.get(img.get("src")).content)

raised OSError: [Errno 22] Invalid argument: '\n萤火之森动漫图片萤火之森卡通图片\n.jpg'

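Fix: the title begins and ends with a newline, which the operating system rejected as part of a file name, so remove it: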

title = title.replace('\n','')
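Newlines are not the only characters that can break a file name. As a slightly more defensive sketch, the helper below also replaces the characters that Windows forbids in file names. The function name safe_filename and the fallback value are assumptions made for this example.

```python
import re


def safe_filename(title, fallback="untitled"):
    """Turn a scraped title into a string that is safe to use as a file name.

    safe_filename and fallback are hypothetical names for this sketch.
    """
    # Drop leading/trailing whitespace, including the '\n' that caused the error.
    cleaned = title.strip()
    # Replace characters Windows forbids in file names, plus control characters.
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', "_", cleaned)
    # Never return an empty name.
    return cleaned or fallback


print(safe_filename("\n萤火之森动漫图片萤火之森卡通图片\n"))  # 萤火之森动漫图片萤火之森卡通图片
print(safe_filename("a/b:c"))                                # a_b_c
```

The cleaned title can then be used in place of title when building the "%s.jpg" file name.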

After this fix, the images are saved successfully.
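One further pitfall is that img.get("src") may be missing or may be a relative path, and requests.get cannot fetch a relative path. The sketch below collects a file name and an absolute URL for every image on a page. It assumes, purely for illustration, that the title comes from the alt attribute; the function name image_jobs and the example.com address are likewise hypothetical.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def image_jobs(html, page_url):
    """Return (file_name, absolute_url) pairs for every <img> that has a src.

    Taking the title from the alt attribute is an assumption of this sketch.
    """
    page = BeautifulSoup(html, "html.parser")
    jobs = []
    for img in page.find_all("img"):
        src = img.get("src")
        if not src:
            continue  # skip images without a usable address
        # Clean the title as in the fix above; fall back to a numbered name.
        title = (img.get("alt") or "").replace("\n", "").strip()
        if not title:
            title = "image%d" % len(jobs)
        # urljoin resolves relative paths against the address of the page.
        jobs.append(("%s.jpg" % title, urljoin(page_url, src)))
    return jobs


# Hypothetical markup and page address, for demonstration only.
sample = '<img src="/a.jpg" alt="foo"><img alt="x"><img src="b.png">'
for name, link in image_jobs(sample, "https://example.com/g/index.html"):
    print(name, link)
```

Each pair can then be passed to the download step: open the file name in 'wb' mode and write the content returned by requests.get for the URL.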
