Python TypeError: 'numpy.int32' object is not iterable

Problem description:

I want to compute the entropy of my k-means result data frame, but I get the error: TypeError: 'numpy.int32' object is not iterable. I don't understand why.

from collections import Counter 
def calcEntropy(x): 
    p, lens = Counter(x), np.float(len(x)) 
    return -np.sum(count/lens*np.log2(count/lens) for count in p.values()) 
k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']] 

Then I get this error message:

<ipython-input-26-d375ecf00330> in <module>() 
----> 1 k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']] 

<ipython-input-26-d375ecf00330> in <listcomp>(.0) 
----> 1 k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']] 

<ipython-input-23-f5508ea8782c> in calcEntropy(x) 
     1 from collections import Counter 
     2 def calcEntropy(x): 
----> 3  p, lens = Counter(x), np.float(len(x)) 
     4  return -np.sum(count/lens*np.log2(count/lens) for count in p.values()) 

/Users/mpiercy/anaconda/lib/python3.6/collections/__init__.py in __init__(*args, **kwds) 
    535    raise TypeError('expected at most 1 arguments, got %d' % len(args)) 
    536   super(Counter, self).__init__() 
--> 537   self.update(*args, **kwds) 
    538 
    539  def __missing__(self, key): 

/Users/mpiercy/anaconda/lib/python3.6/collections/__init__.py in update(*args, **kwds) 
    622      super(Counter, self).update(iterable) # fast path when counter is empty 
    623    else: 
--> 624     _count_elements(self, iterable) 
    625   if kwds: 
    626    self.update(kwds) 

TypeError: 'numpy.int32' object is not iterable 

k_means_sp.head() 

     credit     debit  cluster
0  9.207673  8.198884        1
1  4.248495  8.202181        0
2  8.149668  7.735145        2
3  5.138677  7.859741        0
4  8.058163  7.918614        2

Assuming 'k_means_sp' holds 'numpy.int32' values, you are passing a 'numpy.int32' to 'Counter'. 'Counter' expects an iterable.
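To illustrate the point about 'Counter' needing an iterable, here is a minimal sketch. The small data frame is a made-up stand-in for the asker's 'k_means_sp', and the exact scalar type yielded when iterating a column (Python int or a NumPy integer) may depend on the pandas version, so the error text can differ slightly from the one in the question.

```python
from collections import Counter

import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data frame (not the real data).
k_means_sp = pd.DataFrame({"cluster": np.array([1, 0, 2, 0, 2], dtype=np.int32)})

# Iterating over a column yields one scalar per row, not a sequence.
first = next(iter(k_means_sp["cluster"]))
print(type(first))

try:
    Counter(first)  # a single integer cannot be walked element by element
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Passing the whole column works, because a Series is iterable.
label_counts = Counter(k_means_sp["cluster"])
print(label_counts)
```

So the list comprehension in the question calls 'calcEntropy' once per row with a single integer, which is what 'Counter' rejects.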


What does that mean? Should I make the cluster column cluster = [0,1,2] and y = iter(cluster), or am I doing this completely wrong? Thanks! – bananablue1


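@bananablue1 It means you cannot pass an integer to 'calcEntropy' as it is currently written. The right fix depends on your goal. If you want 'calcEntropy' to work with integers (does that even make sense?), then you should change the function; if you want to pass something else to 'calcEntropy', then pass something else, and so on. – Goyo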

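OK, here is a first attempt. It looks like your data frame stores the cluster index in the 'cluster' column. So what you need to do is select each cluster by its index, then pass that cluster to your calcEntropy function, like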

for i in range(len(k_means_sp['cluster'].unique())):  # loop through cluster indices
    # keep only the rows that belong to cluster i (.loc replaces the removed .ix indexer)
    cluster = k_means_sp.loc[k_means_sp['cluster'] == i, ['credit', 'debit']]
    entropy = calcEntropy(cluster)

The second line filters the rows, keeping only those that share the same cluster index. Does this help?
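As a rough sketch of how this could look end to end, the block below makes several assumptions that are not in the original thread: the function is a rewritten variant named 'calc_entropy' (it uses plain float arithmetic, since 'np.float' was removed in NumPy 1.24), the sample rows are copied from the head() output in the question, and the per-cluster step rounds 'credit' to whole numbers purely so that values can repeat. What quantity the entropy should actually be taken over depends on the asker's goal.

```python
from collections import Counter

import numpy as np
import pandas as pd


def calc_entropy(values):
    """Shannon entropy (in bits) of the distinct values in an iterable."""
    counts = np.array(list(Counter(values).values()), dtype=float)
    probs = counts / counts.sum()
    return float(-np.sum(probs * np.log2(probs)))


# Sample rows taken from the head() output in the question.
k_means_sp = pd.DataFrame({
    "credit": [9.207673, 4.248495, 8.149668, 5.138677, 8.058163],
    "debit": [8.198884, 8.202181, 7.735145, 7.859741, 7.918614],
    "cluster": [1, 0, 2, 0, 2],
})

# Entropy of the cluster labels: pass the whole column, not one element.
label_entropy = calc_entropy(k_means_sp["cluster"])
print(label_entropy)

# Entropy inside each cluster. Rounding 'credit' is an illustrative
# assumption so that continuous values fall into shared bins.
per_cluster = {
    label: calc_entropy(group["credit"].round())
    for label, group in k_means_sp.groupby("cluster")
}
print(per_cluster)
```

Using groupby avoids assuming that the cluster labels are exactly 0, 1, 2, ..., which the range-based loop relies on.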


Yes, thank you very much! – bananablue1