Adding cookies to Python requests to simulate login and scrape web pages

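To simulate a logged-in session, copy the cookies from your browser and add them to the request headers. The cookie string can be read directly from the request details in the browser's developer tools.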
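As a rough illustration of what the copied cookie string contains, the sketch below splits a raw `Cookie` header value into name/value pairs. The helper name `parse_cookie_header` and the sample string are assumptions made for this example; they are not part of requests.

```python
def parse_cookie_header(raw):
    """Split a raw 'Cookie' header value into a {name: value} dict."""
    cookies = {}
    for pair in raw.split(';'):
        pair = pair.strip()
        if not pair:
            continue  # skip empty fragments such as a trailing ';'
        # Split on the first '=' only, since values may contain '=' themselves
        name, _, value = pair.partition('=')
        cookies[name.strip()] = value.strip()
    return cookies


# Hypothetical sample string, shaped like one copied from the browser
raw_cookie = 'cy=258; cye=guiyang; token=abc=def;'
print(parse_cookie_header(raw_cookie))
```

A dict in this form can be passed to requests through the `cookies=` argument of `requests.get` as an alternative to placing the whole string in a `Cookie` header.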

  • Without cookies (not logged in)

# coding: utf-8
import requests

url = 'http://t.dianping.com/deal/22752400'
header = {'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
          'Accept-Encoding': 'gzip, deflate',
          'Accept-Language': 'zh-CN,zh;q=0.9',
          'Cache-Control': 'max-age=0',
          'Connection': 'keep-alive',
          'Host': 't.dianping.com',
          'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36'
          }
# Send the GET request without a Cookie header and print the response body
print(requests.get(url, headers=header).text)

Result of the request without cookies (not logged in):

[Screenshot: response returned without cookies]


  • With cookies (simulated login)

# coding: utf-8
import requests

url = 'http://t.dianping.com/deal/22752400'
header = {'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
          'Accept-Encoding': 'gzip, deflate',
          'Accept-Language': 'zh-CN,zh;q=0.9',
          'Cache-Control': 'max-age=0',
          'Connection': 'keep-alive',
          'Cookie': 'cy=258; cye=guiyang; _lx_utm=utm_source%3DBaidu%26utm_medium%3Dorganic; _lxsdk_cuid=1627487eabec8-0082feec299df4-3e3d5f01-100200-1627487eabfc8; _lxsdk=1627487eabec8-0082feec299df4-3e3d5f01-100200-1627487eabfc8; _hc.v=7fb40515-c2b8-59d3-2b47-427bcabb3554.1522373487; _dp.ac.v=f5832f3d-885a-440c-9a2d-4f0a221ea73e; dper=ce3cbad9cf126491bef9842a52d26dfd28d3e6b65494f66952c02618cef002b7; ll=7fd06e815b796be3df069dec7836c3df; ua=15329319971; JSESSIONID=791B2B83CA269DB065936DE85C79DEDA; _lxsdk_s=16275323b8a-80c-56c-04a%7C%7C2',
          'Host': 't.dianping.com',
          'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36'
          }

# Send the GET request with the Cookie header copied from the browser
print(requests.get(url, headers=header).text)

With the cookies in place, the request is treated as logged in and the page content is returned:

[Screenshot: response returned with cookies]
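When several pages need the same login state, one option is to keep the cookies on a `requests.Session` rather than repeating the `Cookie` header on every call. This is a minimal sketch and not the approach used above: the helper name `make_session` and the shortened cookie values are assumptions for illustration, and the actual request is left commented out.

```python
import requests


def make_session(cookies, user_agent):
    """Return a Session that sends the given cookies and User-Agent on each request."""
    session = requests.Session()
    session.headers.update({'User-Agent': user_agent})
    session.cookies.update(cookies)  # accepts a plain {name: value} dict
    return session


# Hypothetical, shortened cookie values; use the ones copied from your browser
session = make_session({'cy': '258', 'cye': 'guiyang'}, 'Mozilla/5.0')
print(session.cookies.get_dict())

# The session can then be used in place of requests.get, for example:
# response = session.get('http://t.dianping.com/deal/22752400', timeout=10)
# print(response.text)
```

Cookies copied from a browser expire, so a page that starts returning the logged-out result again usually means a fresh cookie string is needed.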