Expanding rows in a pandas DataFrame
Question:
I have the following data:
product Sales_band Hour_id sales
prod_1 HIGH 1 200
prod_1 HIGH 3 100
prod_1 HIGH 4 300
prod_1 VERY HIGH 2 100
prod_1 VERY HIGH 5 253
prod_1 VERY HIGH 6 234
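I want to add rows based on the Hour_id value. Hour_id can take values from 1 to 10, so the data above should be expanded wherever an hour ID is missing. Expected output (sales = 0 for the missing hour IDs):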
product Sales_band Hour_id sales
prod_1 HIGH 1 200
prod_1 HIGH 2 0
prod_1 HIGH 3 100
prod_1 HIGH 4 300
prod_1 HIGH 5 0
prod_1 HIGH 6 0
prod_1 HIGH 7 0
prod_1 HIGH 8 0
prod_1 HIGH 9 0
prod_1 HIGH 10 0
prod_1 VERY HIGH 1 0
prod_1 VERY HIGH 2 100
prod_1 VERY HIGH 3 0
prod_1 VERY HIGH 4 0
prod_1 VERY HIGH 5 253
prod_1 VERY HIGH 6 234
prod_1 VERY HIGH 7 0
prod_1 VERY HIGH 8 0
prod_1 VERY HIGH 9 0
prod_1 VERY HIGH 10 0
How can I do this with a pandas DataFrame in Python?
Answer:
# For each (product, Sales_band) group, reindex Hour_id to 1..10 and fill the gaps with 0.
print(df.groupby(['product', 'Sales_band'])[['Hour_id', 'sales']]
        .apply(lambda x: x.set_index('Hour_id').reindex(range(1, 11), fill_value=0))
        .reset_index())
product Sales_band Hour_id sales
0 prod_1 HIGH 1 200
1 prod_1 HIGH 2 0
2 prod_1 HIGH 3 100
3 prod_1 HIGH 4 300
4 prod_1 HIGH 5 0
5 prod_1 HIGH 6 0
6 prod_1 HIGH 7 0
7 prod_1 HIGH 8 0
8 prod_1 HIGH 9 0
9 prod_1 HIGH 10 0
10 prod_1 VERY HIGH 1 0
11 prod_1 VERY HIGH 2 100
12 prod_1 VERY HIGH 3 0
13 prod_1 VERY HIGH 4 0
14 prod_1 VERY HIGH 5 253
15 prod_1 VERY HIGH 6 234
16 prod_1 VERY HIGH 7 0
17 prod_1 VERY HIGH 8 0
18 prod_1 VERY HIGH 9 0
19 prod_1 VERY HIGH 10 0
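For comparison, here is a self-contained sketch of a different route to the same result. It is not the method used in the answer above: it builds the full grid of (product, Sales_band, Hour_id) combinations and left-joins the observed sales onto it. The names `pairs`, `hours`, `grid` and `expanded` are made up for illustration, and `how="cross"` assumes pandas 1.2 or newer.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "product": ["prod_1"] * 6,
    "Sales_band": ["HIGH"] * 3 + ["VERY HIGH"] * 3,
    "Hour_id": [1, 3, 4, 2, 5, 6],
    "sales": [200, 100, 300, 100, 253, 234],
})

# Pair every (product, Sales_band) that occurs in the data with hours 1..10.
pairs = df[["product", "Sales_band"]].drop_duplicates()
hours = pd.DataFrame({"Hour_id": range(1, 11)})
grid = pairs.merge(hours, how="cross")  # assumption: pandas >= 1.2

# Left-join the observed sales onto the grid; hours with no record become NaN.
expanded = grid.merge(df, on=["product", "Sales_band", "Hour_id"], how="left")

# Replace the missing sales with 0 and restore the integer dtype.
expanded["sales"] = expanded["sales"].fillna(0).astype(int)

print(expanded)
```

This version only expands the product and sales band pairs that actually appear in the data, which matches what the groupby approach does.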
So you should end up with 10 rows for each product and sales band? –
Yes, that should be the ideal final output. – Mukul