Page redirects in the browser but not in Python

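Problem description:

I am trying to test sites to see whether they redirect from HTTP to HTTPS. Here is my code. The page redirects in the browser but not in Python.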

import requests

url = "http://www.google.com"
page = requests.get(url)  # follows redirects by default
if page.history:
    # page.history holds the intermediate redirect responses, oldest first
    print("Request was redirected")
    for resp in page.history:
        print(resp.status_code, resp.url)
    print("Final destination:")
    print(page.status_code, page.url)
else:
    print(page.headers)
    print(page.history)
    print(page.url)
    print(page.status_code)
    print("Request was not redirected")

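When I test http://www.google.com with various online header checkers, I get a 302 redirect to the HTTPS site. However, when I run the code above, I get a 200 status code and a page result. When I run the same code against a site like http://fb.com, I get the following result.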

Request was redirected 
301 http://fb.com/ 
302 http://www.facebook.com/?_rdr 
Final destination: 
200 https://www.facebook.com/

Is this just a Google thing, or am I missing something?

What page do you get back? Could Google be blocking you? – jsfan

Works for me, but try passing a real user agent. –

Answer

Google does a lot of magic based on the user agent string. Try fetching with

page = requests.get(url, headers={'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36'})

or with another user agent string, and see whether that changes the behavior.
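To compare the behavior with and without a browser-like user agent, something along these lines might help. It is only a sketch: the names BROWSER_UA, upgraded_to_https and redirect_chain are made up for illustration, the user agent string is the one from the answer, and what Google actually returns for each user agent is not guaranteed.

```python
from urllib.parse import urlsplit

# User agent string taken from the answer; any browser-like string could be tried.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36"
)


def upgraded_to_https(url_chain):
    """Return True if the chain starts on http:// and ends on https://."""
    if not url_chain:
        return False
    first_scheme = urlsplit(url_chain[0]).scheme.lower()
    last_scheme = urlsplit(url_chain[-1]).scheme.lower()
    return first_scheme == "http" and last_scheme == "https"


def redirect_chain(url, user_agent=None, timeout=10):
    """Fetch url and return every URL visited, ending with the final one."""
    # Third-party dependency; imported here so upgraded_to_https() stays
    # usable even where requests is not installed.
    import requests

    headers = {"User-Agent": user_agent} if user_agent else {}
    page = requests.get(url, headers=headers, timeout=timeout)
    return [resp.url for resp in page.history] + [page.url]


# Example use (performs real network requests, so it is left commented out):
# for ua in (None, BROWSER_UA):
#     chain = redirect_chain("http://www.google.com", ua)
#     print(ua, chain, upgraded_to_https(chain))
```

Passing None as the user agent leaves the default requests user agent in place, so the two runs differ only in that one header.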

Also, note that if you access Google with a script, you may well get blocked, or at least be shown a CAPTCHA, even if you send a real user agent string.

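Where did you get that idea from, I wonder... –

@Padraic Cunnigham: I know it looks like I scooped you, but I didn't. I was actually trying out the same thing I had posted in my comment. – jsfan

Thanks a lot, both of you. Elsewhere in my code I was using a real user agent with httplib. I should have figured this out. –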
