Select the top rows of a pandas DataFrame based on groupby
Question:
Related question: Reordering pandas dataframe based on multiple column and sum of one column
Using the `sort` column, how do I select the top 2 countries in this DataFrame? In this case, the top 2 countries would be Australia and Afghanistan.
   Country_FAO  type   mean_area        sort
5    Australia   car  12141000.0  18910501.0
4    Australia   car   6475695.0  18910501.0
6    Australia   bus    293806.0  18910501.0
0  Afghanistan   car   2029000.0   2141000.0
1  Afghanistan   car    112000.0   2141000.0
2      Algeria   bus    827000.0    829351.0
3      Algeria   bus      2351.0    829351.0
EDIT:
I also want to keep the `type` column. In this case, the result should look like this:
   Country_FAO  type   mean_area        sort
5    Australia   car  12141000.0  18910501.0
4    Australia   car   6475695.0  18910501.0
6    Australia   bus    293806.0  18910501.0
0  Afghanistan   car   2029000.0   2141000.0
1  Afghanistan   car    112000.0   2141000.0
Answer:
UPDATE:
In [166]: df.loc[df.Country_FAO.isin(df.groupby('Country_FAO').sum().nlargest(2, 'mean_area').index)]
Out[166]:
   Country_FAO  type   mean_area        sort
5    Australia   car  12141000.0  18910501.0
4    Australia   car   6475695.0  18910501.0
6    Australia   bus    293806.0  18910501.0
0  Afghanistan   car   2029000.0   2141000.0
1  Afghanistan   car    112000.0   2141000.0
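
For anyone who wants to run this end to end, here is a minimal, self-contained sketch of the same filter. The DataFrame is reconstructed from the printout in the question, so the constructor call and the variable names `totals`, `top_countries` and `result` are assumptions for illustration. It also selects `mean_area` before summing, because how a bare `.sum()` treats the string column `type` has varied across pandas versions.

```python
import pandas as pd

# Sample data reconstructed from the question's printout (an assumption:
# the original DataFrame may have been built differently).
df = pd.DataFrame(
    {
        "Country_FAO": ["Australia", "Australia", "Australia",
                        "Afghanistan", "Afghanistan", "Algeria", "Algeria"],
        "type": ["car", "car", "bus", "car", "car", "bus", "bus"],
        "mean_area": [12141000.0, 6475695.0, 293806.0,
                      2029000.0, 112000.0, 827000.0, 2351.0],
        "sort": [18910501.0] * 3 + [2141000.0] * 2 + [829351.0] * 2,
    },
    index=[5, 4, 6, 0, 1, 2, 3],
)

# Total mean_area per country; only the numeric column enters the aggregation.
totals = df.groupby("Country_FAO")["mean_area"].sum()

# Names of the two countries with the largest totals.
top_countries = totals.nlargest(2).index

# Keep every original row (all columns, including `type`) for those countries.
result = df.loc[df["Country_FAO"].isin(top_countries)]
print(result)
```

Because the boolean mask is applied to the original `df`, the rows keep their existing order and all of their columns.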
Here is how I would build it up, step by step:
In [153]: df.groupby('Country_FAO').sum()
Out[153]:
              mean_area
Country_FAO
Afghanistan   2141000.0
Algeria        829351.0
Australia    18910501.0

In [154]: df.groupby('Country_FAO').sum().nlargest(2, 'mean_area')
Out[154]:
              mean_area
Country_FAO
Australia    18910501.0
Afghanistan   2141000.0

In [155]: df.groupby('Country_FAO').sum().nlargest(2, 'mean_area').index
Out[155]: Index(['Australia', 'Afghanistan'], dtype='object', name='Country_FAO')
Also, you might want to reset the index:
In [156]: df.groupby('Country_FAO').sum().nlargest(2, 'mean_area').reset_index()
Out[156]:
   Country_FAO   mean_area
0    Australia  18910501.0
1  Afghanistan   2141000.0
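
Since the question asks specifically about the `sort` column, here is a hedged alternative that ranks countries by that column instead of recomputing the sums. It assumes `sort` already holds the per-country total of `mean_area` on every row, as it does in the sample data. The DataFrame construction and the names `per_country`, `top_countries` and `result` are illustrative, not part of the original question.

```python
import pandas as pd

# Sample data reconstructed from the question's printout (an assumption).
df = pd.DataFrame(
    {
        "Country_FAO": ["Australia", "Australia", "Australia",
                        "Afghanistan", "Afghanistan", "Algeria", "Algeria"],
        "type": ["car", "car", "bus", "car", "car", "bus", "bus"],
        "mean_area": [12141000.0, 6475695.0, 293806.0,
                      2029000.0, 112000.0, 827000.0, 2351.0],
        "sort": [18910501.0] * 3 + [2141000.0] * 2 + [829351.0] * 2,
    },
    index=[5, 4, 6, 0, 1, 2, 3],
)

# `sort` repeats the group total on every row, so one row per country is enough.
per_country = df.drop_duplicates("Country_FAO").set_index("Country_FAO")["sort"]

# The two countries with the largest precomputed totals.
top_countries = per_country.nlargest(2).index

# Filter the original rows, keeping `type` and every other column.
result = df[df["Country_FAO"].isin(top_countries)]
print(result)
```

This skips the groupby entirely, but it is only correct when `sort` is constant within each country. If that is not guaranteed, summing `mean_area` per group is the safer route.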
Thanks @MaxU, this solution drops the 'type' column. Is there a way to keep it? – user308827
@user308827, I've updated my answer - please check – MaxU
Thanks @MaxU, this works! – user308827