What is the pythonic way to download an hdf5 file from an http server?

Problem description:

I am trying to download an hdf5 file from an http server. I can do this with Python's subprocess module and wget, but I feel like I'm cheating.

    # wget solution
    import subprocess
    url = 'http://url/to/file.h5'
    # --proxy=off is the older spelling; newer wget releases use --no-proxy
    subprocess.call(['wget', '--proxy=off', url])

I can also download images with the urllib and requests modules, like this:

    # requests solution
    import requests
    url2 = 'http://url/to/image.png'
    r = requests.get(url2)
    with open('image.png', 'wb') as img:
        img.write(r.content)

    # urllib solution (Python 2; urllib.request.urlretrieve in Python 3)
    import urllib
    urllib.urlretrieve(url2, 'outfile.png')

However, when I download the HDF5 file this way and run the shell command `file` on the result, I get:

    >file test.h5
    >test.h5: HTML document, ASCII text, with very long lines
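
The same check can be made from Python without shelling out to `file`. This is a minimal sketch, assuming a standard HDF5 file whose superblock sits at offset 0 (the usual case), so that the file begins with the 8-byte signature `\x89HDF\r\n\x1a\n`. The helper name `looks_like_hdf5` is hypothetical and not part of any library:

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # first 8 bytes of a standard HDF5 file


def looks_like_hdf5(path):
    """Return True if the file at `path` starts with the HDF5 signature."""
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

A download that was replaced by an HTML page, as in the `file` output above, would return False here because it starts with markup text rather than the signature.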

Here are the headers from requests.get() (not sure if they help):

    {'accept-ranges': 'bytes',
     'content-length': '413399',
     'date': 'Tue, 19 Feb 2013 08:51:06 GMT',
     'etag': 'W/"413399-1361177055000"',
     'last-modified': 'Mon, 18 Feb 2013 08:44:15 GMT',
     'server': 'Apache-Coyote/1.1'}

Should I use wget via subprocess, or is there a pythonic solution?

Solution: The problem was that I had not disabled the proxy before trying to download the file, so the transfer was intercepted. This code did the trick.

    import urllib2

    # An empty dict means "use no proxies", overriding the environment settings
    proxy_handler = urllib2.ProxyHandler({})
    opener = urllib2.build_opener(proxy_handler)
    urllib2.install_opener(opener)

    url = 'http://url/to/file.h5'

    req = urllib2.Request(url)
    r = opener.open(req)
    result = r.read()

    with open('my_file.h5', 'wb') as f:
        f.write(result)
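
The code above is Python 2. For readers on Python 3, where `urllib2` was merged into `urllib.request`, a rough equivalent might look like the sketch below. The function name `download_without_proxy` is hypothetical, and streaming the body to disk with `shutil.copyfileobj` is a substitution for the single `read()` call in the original, not part of it:

```python
import shutil
import urllib.request


def download_without_proxy(url, dest):
    """Fetch `url` into the file `dest`, ignoring any configured proxy."""
    # An empty mapping tells ProxyHandler to use no proxies at all,
    # overriding environment variables such as http_proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url) as response, open(dest, "wb") as out:
        # Copy in chunks so a large file is not held in memory at once.
        shutil.copyfileobj(response, out)
```

It would be called as `download_without_proxy('http://url/to/file.h5', 'my_file.h5')`. Unlike `install_opener`, this keeps the proxy-free opener local to the function and leaves the global default untouched.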

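With urllib, did you actually look at the file? It sounds more like the request is being intercepted and you are downloading an html document instead of the png you are looking for. Maybe you can find a better link then? – Oin 2013-02-19 09:33:10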

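You are right, the request was being intercepted! Once I disabled the proxy and made a few other changes, I finally got the correct file. This is my first post. Is it OK to include my complete solution in the original question? Should I edit the title of the question in some way? – LordBullingdon 2013-02-19 10:35:57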

Yes, you can edit your question and then post your answer. – Oin 2013-02-19 10:38:40

Answer

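Try using geturl() to get the real URL (after following redirects), then pass that to urlretrieve. Note that geturl() is a method of the object returned by urllib.urlopen, not a function in the urllib module itself.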
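
A minimal sketch of that suggestion, written for Python 3 (an assumption; the thread itself uses Python 2), where both calls live in `urllib.request`. The function name `retrieve_following_redirects` is hypothetical:

```python
import urllib.request


def retrieve_following_redirects(url, dest):
    """Resolve `url` to its final location, then download it to `dest`."""
    # geturl() reports the URL that was actually fetched, after any redirects.
    with urllib.request.urlopen(url) as response:
        real_url = response.geturl()
    urllib.request.urlretrieve(real_url, dest)
    return real_url
```

This uses the default opener, so a proxy configured in the environment would still apply. It only helps when the wrong content comes from a redirect rather than from an intercepting proxy.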

