Scraping city weather data from China Weather (weather.com.cn) with a Python crawler, shown as a bar chart and on a map of China (pyquery + pyecharts)
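1. Bar chart version:
This version uses pyquery to scrape the maximum temperature of each city from China Weather (weather.com.cn). Because the site shows the current day's temperature as empty by evening, the script scrapes the previous day's (yesterday's) data instead.
The scraped maximum temperatures are sorted from highest to lowest and displayed with a pyecharts bar chart.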
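from pyquery import PyQuery as pq
from pyecharts import Bar  # pyecharts 0.x API; 1.x moved the charts to pyecharts.charts

records = []    # renamed from `list` so the built-in is not shadowed
listdept = []   # URLs of the regional pages
base = 'http://www.weather.com.cn'
url = 'http://www.weather.com.cn/textFC/hb.shtml'

# Collect the link of every regional page from the tab bar.
doc = pq(url=url, encoding='utf-8')
for dept in doc('.lq_contentboxTab2 li a'):
    listdept.append(base + pq(dept).attr('href'))

# The last tab is skipped, as in the original listdept[:-1].
for dept_url in listdept[:-1]:
    doc = pq(url=dept_url, encoding='utf-8')
    provinces = doc('.conMidtab')[1]  # second .conMidtab block, as in the original
    for province in provinces:
        citys = pq(province)('tr')
        # City rows start at index 2; the rows before them are skipped.
        for i, city in enumerate(citys[2:], start=2):
            if i == 2:
                # The first city row also holds the province cell, so columns shift by one.
                records.append({'city': pq(city[1])('a').html(), 'max': pq(city[4]).html()})
            else:
                records.append({'city': pq(city[0])('a').html(), 'max': pq(city[3]).html()})

# Sort from hottest to coldest.
records = sorted(records, key=lambda x: int(x['max']), reverse=True)
citylist = [item['city'] for item in records]
maxlist = [int(item['max']) for item in records]

bar = Bar("Maximum temperature ranking", "by babihuang")
bar.add("Maximum temperature", citylist, maxlist, is_more_utils=True)
bar.show_config()
bar.render()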
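The sort step above calls int() on every scraped value, so a single empty cell (which, as noted, the site can show in the evening) would raise an error. The following is a minimal sketch of a more defensive ranking step, not part of the original script. The helper names `parse_temp` and `rank_by_max` and the sample records are made up for illustration; only the record shape (`city`, `max`) follows the scraper.

```python
def parse_temp(raw):
    """Return the temperature as an int, or None if the cell is empty or non-numeric."""
    try:
        return int(str(raw).strip())
    except (TypeError, ValueError):
        return None


def rank_by_max(records):
    """Drop records without a usable 'max' value and sort the rest from hottest to coldest."""
    cleaned = []
    for rec in records:
        temp = parse_temp(rec.get('max'))
        if temp is not None:
            cleaned.append({'city': rec['city'], 'max': temp})
    return sorted(cleaned, key=lambda r: r['max'], reverse=True)


# Made-up sample records, shaped like the ones the scraper builds.
sample = [
    {'city': 'A', 'max': '21'},
    {'city': 'B', 'max': ''},     # empty cell, as the site may show in the evening
    {'city': 'C', 'max': '30'},
    {'city': 'D', 'max': None},   # cell missing entirely
    {'city': 'E', 'max': '-3'},
]

ranked = rank_by_max(sample)
cities = [r['city'] for r in ranked]   # x-axis labels for the bar chart
temps = [r['max'] for r in ranked]     # bar heights, already ints
print(cities, temps)
```

The two lists `cities` and `temps` could then be passed to `bar.add` in place of the lists built by the script. Whether to drop or flag incomplete records is a design choice; dropping keeps the chart simple.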
2. Map of China version:
If the map does not render, install the map packages with pip:
pip install echarts-countries-pypkg
pip install echarts-china-provinces-pypkg
pip install echarts-china-cities-pypkg
Because not every city in the weather data has a match in the map packages, the map shows only the data for each province's capital.
from pyquery import PyQuery as pq
from pyecharts import Geo  # pyecharts 0.x API; 1.x moved the charts to pyecharts.charts

records = []    # renamed from `list` so the built-in is not shadowed
listdept = []   # URLs of the regional pages
base = 'http://www.weather.com.cn'
url = 'http://www.weather.com.cn/textFC/hb.shtml'

# Collect the link of every regional page from the tab bar.
doc = pq(url=url, encoding='utf-8')
for dept in doc('.lq_contentboxTab2 li a'):
    listdept.append(base + pq(dept).attr('href'))

# The last tab is skipped, as in the original listdept[:-1].
for dept_url in listdept[:-1]:
    doc = pq(url=dept_url, encoding='utf-8')
    provinces = doc('.conMidtab')[1]  # second .conMidtab block, as in the original
    for province in provinces:
        citys = pq(province)('tr')
        # Keep only the first city row (index 2) of each province: the capital.
        # This row also holds the province cell, so the city is in column 1.
        for city in citys[2:3]:
            records.append({'city': pq(city[1])('a').html(), 'max': pq(city[4]).html()})

# Sort from hottest to coldest.
records = sorted(records, key=lambda x: int(x['max']), reverse=True)

# The "市" suffix is appended to each city name, as in the original.
data = [(item['city'] + "市", int(item['max'])) for item in records]

geo = Geo("Maximum temperature ranking of provincial capitals", "by babihuang", title_color="#fff", title_pos="center", width=1200, height=600, background_color='#404a59')
attr, value = geo.cast(data)
geo.add("", attr, value, visual_range=[0, 45], visual_text_color="#fff", symbol_size=15, is_visualmap=True)
geo.show_config()
geo.render()
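The script works around unmatched cities by keeping only provincial capitals. Another option, sketched here under assumptions, is to keep every city the map knows and set the rest aside. The function `split_by_known` and the `known_cities` set are hypothetical: this sketch does not show how to obtain the list of names the map packages actually support, so a small hand-written set stands in for it.

```python
def split_by_known(pairs, known_cities):
    """Split (city, value) pairs into those the map can place and the names it cannot."""
    matched = [(city, value) for city, value in pairs if city in known_cities]
    unmatched = [city for city, _ in pairs if city not in known_cities]
    return matched, unmatched


# Hypothetical stand-in for the set of city names the map packages can locate.
known_cities = {'北京市', '天津市', '石家庄市'}

# Made-up (city, max temperature) pairs in the shape the script passes to geo.cast().
pairs = [('北京市', 30), ('某某市', 25), ('天津市', 29)]

matched, unmatched = split_by_known(pairs, known_cities)
print(matched)    # pairs that are safe to plot
print(unmatched)  # names to log or review by hand
```

Only `matched` would be passed on to `geo.cast`; printing `unmatched` makes it easy to see which names need a different spelling or suffix.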