Pythonic way to find the most frequent elements?

Problem description:

lst = [1, 3, 5, 1, 5, 6, 1, 1, 3, 4, 5, 2, 3, 4, 5, 3, 4]

I want to find all the elements that occur most frequently. So I want:

most = [1, 3, 5]

1, 3 and 5 occur the most often, which is 4 times each. What is a fast, pythonic way to do this? I tried the approach shown here:

How to find most common elements of a list?

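But it only gives me the top 3, and I need all of the elements. Thanks.

The linked answer slices with 'popular_words[:3]' so that only the top 3 are returned. The actual counter contains all the counts, not just the top 3. – krock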

Answer

With collections.Counter and a list comprehension:

from collections import Counter 

lst = [1, 3, 5, 1, 5, 6, 1, 1, 3, 4, 5, 2, 3, 4, 5, 3, 4] 
r = [x for x, _ in Counter(lst).most_common(3)] 
print(r) 
# [1, 3, 5]

You can generalize this to the highest count, whatever it is, by taking the max of the counter's values:

c = Counter(lst) 
m = max(c.values()) 
r = [k for k in c if c[k] == m] 
print(r) 
# [1, 3, 5]

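For large iterables, to iterate through the counter efficiently and stop as soon as the required items have been collected, you can use takewhile together with most_common called without any arguments: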

from itertools import takewhile 

c = Counter(lst) 
m = max(c.values()) 
r = [x for x, _ in takewhile(lambda x: x[1]==m, c.most_common())] 
print(r) 
# [1, 3, 5]

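You gain by not iterating through all the items in the Counter object, although there is some overhead because most_common has to sort the items, so I'm not sure this is actually more efficient overall. You can do some experiments with timeit.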
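The timeit experiment suggested above could be sketched roughly as follows. The function names by_max_filter and by_takewhile, the seeded random test data and the run count are assumptions made for illustration, not part of the original answer. Timings depend on the machine and on how many distinct values the input has.

```python
import random
from collections import Counter
from itertools import takewhile
from timeit import timeit


def by_max_filter(items):
    """Scan every key of the counter and keep those with the top count."""
    c = Counter(items)
    if not c:
        return []
    m = max(c.values())
    return [k for k in c if c[k] == m]


def by_takewhile(items):
    """Sort with most_common(), then stop at the first lower count."""
    c = Counter(items)
    if not c:
        return []
    m = max(c.values())
    return [x for x, _ in takewhile(lambda kv: kv[1] == m, c.most_common())]


lst = [1, 3, 5, 1, 5, 6, 1, 1, 3, 4, 5, 2, 3, 4, 5, 3, 4]

# Hypothetical larger input with many distinct values (seeded for repeatability).
random.seed(0)
big = random.choices(range(5000), k=50000)

for fn in (by_max_filter, by_takewhile):
    seconds = timeit(lambda: fn(big), number=20)
    print(f"{fn.__name__}: {seconds:.4f} s for 20 runs")
```

Both functions also call max(c.values()), which is itself a full pass over the counts, so any saving from takewhile is limited to the final filtering step.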

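This works, but I want to find all the elements that occur with the highest frequency, not just the top 3. Thank you. –

@ArjunVasudevan I have updated the answer for the general case –

Answer

You can do the following if you want to print all the elements, most common first: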

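from collections import Counter

words = [1, 3, 5, 1, 5, 6, 1, 1, 3, 4, 5, 2, 3, 4, 5, 3, 4]
# most_common() with no argument returns every element, highest count first
most = [word for word, word_count in Counter(words).most_common()]
print(most)
>>>
[1, 3, 5, 4, 6, 2]
# Order shown is for Python 3.7+, where equal counts keep first-seen order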

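Note that if you want to limit the number of results, you can pass a number to the most_common() function. For example: ...most_common(3)]. Hope this answers your question.

Answer

You can also use groupby from the itertools module with a list comprehension to get the same result: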

from itertools import groupby 

a = [1, 3, 5, 1, 5, 6, 1, 1, 3, 4, 5, 2, 3, 4, 5, 3, 4] 
most_common = 3 
final = [k for k,v in groupby(sorted(a), lambda x: x) if len(list(v)) > most_common]

Output:

print(final) 
>>> [1, 3, 5]

Well, this assumes you already have an a priori threshold –

Yes, that's true. But it can be extended to handle all cases. –
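One way the groupby approach might be extended so that no threshold has to be known in advance is sketched below. The function name most_frequent_groupby is hypothetical and not from the answer. The idea is to compute each run length once, take the largest, and keep the keys that match it.

```python
from itertools import groupby


def most_frequent_groupby(items):
    """Return every element that shares the highest count (sorted order)."""
    # groupby only merges adjacent equal items, so sort first.
    counts = [(key, sum(1 for _ in group)) for key, group in groupby(sorted(items))]
    # default=0 keeps an empty input from raising ValueError.
    highest = max((n for _, n in counts), default=0)
    return [key for key, n in counts if n == highest]


a = [1, 3, 5, 1, 5, 6, 1, 1, 3, 4, 5, 2, 3, 4, 5, 3, 4]
print(most_frequent_groupby(a))
# [1, 3, 5]
```

Because of the sort this runs in O(n log n) time and requires the elements to be orderable, whereas the Counter-based answers only need them to be hashable.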
