Pandas: Why does DataFrame.apply(f, axis=1) call f when the DataFrame is empty?

Problem description:

Why does pandas' DataFrame.apply method call the function being applied when the DataFrame is empty?

For example:

>>> import pandas as pd 
>>> df = pd.DataFrame({"foo": []}) 
>>> df 
Empty DataFrame 
Columns: [foo] 
Index: [] 
>>> x = [] 
>>> df.apply(x.append, axis=1) 
Series([], dtype: float64) 
>>> x 
[Series([], dtype: float64)] # <<< why was the apply callback called with an empty row? 

Digging into the pandas source, this looks like the culprit:

if not all(self.shape):
    # How to determine this better?
    is_reduction = False
    try:
        is_reduction = not isinstance(f(_EMPTY_SERIES), Series)
    except Exception:
        pass

    if is_reduction:
        return Series(NA, index=self._get_agg_axis(axis))
    else:
        return self.copy()

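It looks like pandas is calling the function on an empty Series (the f(_EMPTY_SERIES) call above) in an attempt to guess whether the result should be a Series or a DataFrame.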
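To make the guess concrete, here is a minimal sketch of the same probing logic outside of pandas. The helper name probe_is_reduction is hypothetical and only mirrors the excerpt quoted above; it is not part of the pandas API, and the real implementation may differ between versions.

```python
import pandas as pd


def probe_is_reduction(f):
    """Hypothetical helper mirroring the quoted heuristic.

    Calls f once on an empty Series and treats any return value that is
    not a Series as a sign that f is a reduction.
    """
    is_reduction = False
    try:
        is_reduction = not isinstance(f(pd.Series([], dtype="float64")), pd.Series)
    except Exception:
        # Errors raised by f are swallowed, as in the quoted excerpt.
        pass
    return is_reduction


x = []
# list.append returns None, which is not a Series, so it counts as a reduction.
print(probe_is_reduction(x.append))
# The probe call itself has appended an empty Series to x.
print(len(x))
```

The side effect is the point: any callback that mutates state, logs, or performs I/O gets invoked once by the probe even though there are no rows to process.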

I guess a patch is in order.

Edit: this has since been patched. The behavior is now documented, and a reduce option can be used to avoid it: http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.apply.html
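Independently of which pandas version is installed, a caller-side guard also avoids the extra call. The sketch below is one possible workaround, not something taken from the pandas documentation: the wrapper name apply_rows_skip_empty is hypothetical, and returning an empty float64 Series for an empty frame is an assumption about what the caller wants.

```python
import pandas as pd


def apply_rows_skip_empty(df, f):
    """Hypothetical wrapper: apply f row by row, but never call f on an empty frame."""
    if df.empty:
        # Assumption: an empty float64 Series is an acceptable result here.
        return pd.Series(index=df.index, dtype="float64")
    return df.apply(f, axis=1)


x = []
empty = pd.DataFrame({"foo": []})
result = apply_rows_skip_empty(empty, x.append)
# x stays empty because the callback was never invoked.
print(x)
```

DataFrame.empty is True whenever any axis has length zero, which matches the not all(self.shape) condition in the quoted source.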