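Extract rows from a pandas DataFrame based on column values
Problem description:
How can I extract the rows where a column matches a specific value, from a DataFrame created from an Excel file?
Here are a few rows of the DataFrame: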
              Food    Men  Women
0      Total fruit  86.20  88.26
1    Apples, Total  89.01  89.66
2  Apples as fruit  89.18  90.42
3      Apple juice  88.78  88.42
4          Bananas  95.42  94.18
5          Berries  84.21  81.73
6           Grapes  88.79  88.13
This is the code I use to read the Excel file, select the columns I need, and rename them appropriately:
data1= pd.read_excel('USFoodCommodity.xls', sheetname='94-98 FAH', skiprows=76,skip_footer=142, parse_cols='A, H, K')
data1.columns = ['Food', 'Men', 'Women']
# Try 1: data1 = data1[data1['Food'].isin(['Total fruit']) == True] works
# Try 2: data1 = data1[data1['Food'].isin(['Apple, Total']) == True] doesn't work
# Try 3: data1 = data1.iloc[[1]] returns Apples, Total but not appropriate to use integer index
# Try 4: data1[data1['Food'] == 'Berries'] doesn't work
So far, based on other answers I found, I can only return the first row, where Food = "Total fruit". When I try the other approaches above, I only get the column names, like:
Food Men Women
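I'm new to pandas and can't see where I'm going wrong. Why can I extract the first row, where Food == 'Total fruit', but nothing else?
Answer
It works fine for me. Maybe the problem is some stray whitespace in the values, which can be removed with strip: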
print (data1.Food.tolist())
['Total fruit', 'Apples, Total ', 'Apples as fruit',
'Apple juice', 'Bananas', ' Berries', 'Grapes']
data1['Food'] = data1['Food'].str.strip()
print (data1.Food.tolist())
['Total fruit', 'Apples, Total', 'Apples as fruit',
'Apple juice', 'Bananas', 'Berries', 'Grapes']
data2 = data1[data1['Food'].isin(['Total fruit'])]
print (data2)
Food Men Women
0 Total fruit 86.2 88.26
data3 = data1[data1['Food'].isin(['Apples, Total'])]
print (data3)
Food Men Women
1 Apples, Total 89.01 89.66
data3 = data1[data1['Food'].isin(['Berries'])]
print (data3)
Food Men Women
5 Berries 84.21 81.73
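To make the whitespace explanation easy to check without the Excel file, here is a minimal sketch. The DataFrame below is a hypothetical stand-in built by hand from the rows in the question, with the stray spaces copied from the tolist() output above; it is not the asker's actual file.

```python
import pandas as pd

# Hypothetical stand-in for the Excel data; the trailing space in
# 'Apples, Total ' and the leading space in ' Berries' mirror the
# tolist() output shown in this answer.
data1 = pd.DataFrame({
    'Food': ['Total fruit', 'Apples, Total ', 'Apples as fruit',
             'Apple juice', 'Bananas', ' Berries', 'Grapes'],
    'Men': [86.20, 89.01, 89.18, 88.78, 95.42, 84.21, 88.79],
    'Women': [88.26, 89.66, 90.42, 88.42, 94.18, 81.73, 88.13],
})

# Before stripping, an exact comparison finds nothing: the stored
# value is ' Berries', not 'Berries'.
before = data1[data1['Food'] == 'Berries']

# Remove leading/trailing whitespace, then compare again.
data1['Food'] = data1['Food'].str.strip()
after = data1[data1['Food'] == 'Berries']

print(len(before))  # number of rows matched before stripping
print(after)        # rows matched after stripping
```

Printing the column with tolist() or repr() first is a quick way to see whether hidden spaces are present before deciding to strip.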
Answer
Use this code:
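data1 = pd.read_excel('USFoodCommodity.xls', sheetname='94-98 FAH', skiprows=76, skip_footer=142, parse_cols='A, H, K')
data1.columns = ['Food', 'Men', 'Women']  # rename so that row['Food'] exists
list_of_strings_to_match = ['Total fruit', 'Berries', 'Grape']
# Print every row whose Food value is exactly one of the strings in the list
for index, row in data1.iterrows():
    if row['Food'] in list_of_strings_to_match:
        print(row)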
No rows are returned for Berries or Grapes – dreamin
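One possible reason the loop above misses those rows: both `in` and `isin` require an exact match, so ' Berries' (with a leading space) does not match 'Berries', and 'Grape' does not match 'Grapes'. A hedged sketch of a vectorized alternative follows, using a hypothetical hand-built DataFrame in place of the Excel file and a variable name (`wanted`) chosen for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the Excel data, including the stray spaces
# reported in the first answer.
data1 = pd.DataFrame({
    'Food': ['Total fruit', 'Apples, Total ', 'Apples as fruit',
             'Apple juice', 'Bananas', ' Berries', 'Grapes'],
    'Men': [86.20, 89.01, 89.18, 88.78, 95.42, 84.21, 88.79],
    'Women': [88.26, 89.66, 90.42, 88.42, 94.18, 81.73, 88.13],
})

# Full, exact names to look for (note 'Grapes', not 'Grape').
wanted = ['Total fruit', 'Berries', 'Grapes']

# Strip whitespace on the fly and build a boolean mask in one pass,
# instead of looping over rows with iterrows().
mask = data1['Food'].str.strip().isin(wanted)
matches = data1[mask]

# The partial string 'Grape' matches nothing, because isin is exact.
partial = data1['Food'].str.strip().isin(['Grape'])

print(matches)
print(partial.any())
```

If partial matches are actually wanted, `Series.str.contains` is the usual tool, but that is a different behavior from the exact matching discussed in this thread.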