Pandas DataFrame: group by A, take the largest B, output C

Problem description:

What are the top two C values for each A, based on the value in B?

import pandas as pd

df = pd.DataFrame({
      'A': ["first","second","second","first",
         "second","first","third","fourth",
         "fifth","second","fifth","first",
         "first","second","third","fourth","fifth"],
      'B': [1,1,1,2,2,3,3,3,3,4,4,5,6,6,6,7,7],
      'C': ["a", "b", "c", "d",
        "e", "f", "g", "h",
        "i", "j", "k", "l",
        "m", "n", "o", "p", "q"]})

I tried:

x = df.groupby(['A'])['B'].nlargest(2) 

    A
    fifth   16    7
            10    4
    first   12    6
            11    5
    fourth  15    7
            7     3
    second  13    6
            9     4
    third   14    6
            6     3

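But this drops column C, which is the actual value I need.

I want C in the results, not the row index of the original df. Do I have to join back? I would even settle for just a list of the C values on their own...

I need to act on the top 2 C values (based on B) for each A.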
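On the "join back" idea: the result of the `nlargest` call above is indexed by A and by the original row label, so one possible way to recover C is to look those labels up in `df`. The following is only a sketch under that assumption; the names `rows` and `top_c` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ["first", "second", "second", "first",
          "second", "first", "third", "fourth",
          "fifth", "second", "fifth", "first",
          "first", "second", "third", "fourth", "fifth"],
    'B': [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 6, 6, 6, 7, 7],
    'C': list("abcdefghijklmnopq"),
})

# Same call as in the question: a Series indexed by (A, original row label).
x = df.groupby('A')['B'].nlargest(2)

# The last index level holds the original row labels of df.
rows = x.index.get_level_values(-1)

# Look those rows up in df to get C back alongside A and B.
top_c = df.loc[rows, ['A', 'B', 'C']]
print(top_c)
```

This keeps the original row labels as the index of `top_c`, which may or may not be what is wanted downstream.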

IIUC:

In [42]: df.groupby(['A'])[['B','C']].apply(lambda x: x.nlargest(2, columns=['B']))
Out[42]:
           B  C
A
fifth  16  7  q
       10  4  k
first  12  6  m
       11  5  l
fourth 15  7  p
       7   3  h
second 13  6  n
       9   4  j
third  14  6  o
       6   3  g
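If only the C values per group are needed (the "list of C" mentioned in the question), an alternative that avoids `apply` is to sort by B in descending order and keep the first two rows of each group. This is a sketch of a different technique, not the method shown above; `top2` and `c_by_a` are illustrative names, and rows tied on B may be ordered differently than with `nlargest`.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ["first", "second", "second", "first",
          "second", "first", "third", "fourth",
          "fifth", "second", "fifth", "first",
          "first", "second", "third", "fourth", "fifth"],
    'B': [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 6, 6, 6, 7, 7],
    'C': list("abcdefghijklmnopq"),
})

# Sort so the largest B comes first, then keep the first two rows per A.
top2 = df.sort_values('B', ascending=False).groupby('A').head(2)

# Collapse to one list of C values per A, ordered by descending B.
c_by_a = top2.groupby('A')['C'].agg(list)
print(c_by_a)
```

For the sample data this yields two C values per group, for example `['q', 'k']` for `fifth`.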

That's it. This one had my head hurting for hours. I'm new to both pandas and Python, so this helped a lot! Thanks MaxU! – user4445586