Pandas: how do I combine matrices that have different indexes and columns?

Question:

I am using Python, pandas, and numpy to read some data.

I have two data frames:

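Input 1 - cost matrix (the cost per region and season): index = regions, columns = seasons

Input 2 - binary matrix (value 1 when month "a" belongs to season "b"): index = seasons, columns = months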

The output I want is a matrix C holding the cost for each region and month: index = regions, columns = months.

Can anyone help me? I have searched Google a lot but cannot find a solution.

Update with my code:

import pandas as pd 
import numpy as np 
from xlwings import Workbook, Range 
import os 
print(os.getcwd()) 
link = (os.getcwd() + '/test.xlsx') 
print(link) 

#Open the Workbook 
wb = Workbook(link) 
# 
#Reading data 

regions=np.array(Range('Sheet1','regions').value) 
#[u'Region A' u'Region B' u'Region C' u'Region D'] 

seasons=np.array(Range('Sheet1','seasons').value) 
#[u'Season A' u'Season B' u'Season C' u'Season D'] 

months=np.array(Range('Sheet1','months').value) 
#[u'Jan' u'Feb' u'Mar' u'Apr' u'May' u'Jun' u'Jul' u'Aug'] 

#read relationship between season and month 
data=Range('Sheet1','rel').table.value 
relationship=pd.DataFrame(data[0:], index = regions, columns=months) 
#   Jan Feb Mar Apr May Jun Jul Aug 
#Region A 1 1 0 0 0 0 0 0 
#Region B 0 0 1 1 0 0 0 0 
#Region C 0 0 0 0 1 1 0 0 
#Region D 0 0 0 0 0 0 1 1 

#read the cost per region 
data=Range('Sheet1','cost').table.value 
cost=pd.DataFrame(data[0:], index = regions, columns=seasons) 
#   Season A Season B Season C Season D 
#Region A   1   9   7   2 
#Region B   7   0   3   3 
#Region C   4   0   7   5 
#Region D   3  10   3  10 


#What I want: 
#  Jan Feb Mar Apr May Jun Jul Aug 
#Region A 1 1 9 9 7 7 2 2 
#Region B 7 7 0 0 3 3 3 3 
#Region C 4 4 0 0 7 7 5 5 
#Region D 3 3 10 10 3 3 10 10

Comment: Can you provide sample data for your data frames?

Answer

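I believe there is a mistake in the relationship data frame in your example: you state explicitly that it should describe the relationship between seasons (not regions) and months, so I changed it accordingly.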

import pandas as pd 
import numpy as np 

regions = ['Region A', 'Region B', 'Region C', 'Region D'] 
seasons = ['Season A', 'Season B', 'Season C', 'Season D'] 
cost_data = np.array([[1, 9, 7, 2], [7, 0, 3, 3], [4, 0, 7, 5], [3, 10, 3, 10]]) 

cost = pd.DataFrame(data=cost_data, index=regions, columns=seasons) 

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug'] 
rel_data = np.array([[1, 1, 0, 0, 0, 0, 0, 0], 
        [0, 0, 1, 1, 0, 0, 0, 0], 
        [0, 0, 0, 0, 1, 1, 0, 0], 
        [0, 0, 0, 0, 0, 0, 1, 1]]) 

rel = pd.DataFrame(data=rel_data, index=seasons, columns=months) 

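# Start with an empty result frame: regions as rows, months as columns
c = pd.DataFrame(index=regions, columns=months)
for region in regions:
    for month in months:
        for season in seasons:
            # rel is 1 where the month belongs to the season
            if rel.loc[season, month]:
                c.loc[region, month] = cost.loc[region, season]

print(c)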

#   Jan Feb Mar Apr May Jun Jul Aug 
#Region A 1 1 9 9 7 7 2 2 
#Region B 7 7 0 0 3 3 3 3 
#Region C 4 4 0 0 7 7 5 5 
#Region D 3 3 10 10 3 3 10 10
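
The nested loops in the answer can also be written without explicit iteration. The sketch below is an alternative, not part of the original answer, and it assumes every month belongs to exactly one season. Under that assumption the matrix product of cost (regions x seasons) and rel (seasons x months) picks out the single matching cost for each region and month. If a month were flagged for several seasons, the product would sum their costs instead. The names c_dot, c_lookup, and season_of_month are made up for illustration.

```python
import numpy as np
import pandas as pd

regions = ['Region A', 'Region B', 'Region C', 'Region D']
seasons = ['Season A', 'Season B', 'Season C', 'Season D']
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug']

cost = pd.DataFrame(
    [[1, 9, 7, 2], [7, 0, 3, 3], [4, 0, 7, 5], [3, 10, 3, 10]],
    index=regions, columns=seasons)

rel = pd.DataFrame(
    [[1, 1, 0, 0, 0, 0, 0, 0],
     [0, 0, 1, 1, 0, 0, 0, 0],
     [0, 0, 0, 0, 1, 1, 0, 0],
     [0, 0, 0, 0, 0, 0, 1, 1]],
    index=seasons, columns=months)

# Option 1: matrix product. DataFrame.dot aligns cost's columns with
# rel's index, so both must carry the same season labels.
c_dot = cost.dot(rel)

# Option 2: find the season of each month (the row holding the 1),
# then select the matching cost column once per month.
season_of_month = rel.idxmax(axis=0)
c_lookup = cost[list(season_of_month)]
c_lookup.columns = months

print(c_dot)
```

Both c_dot and c_lookup hold the same values as the table printed by the loop version. The lookup variant does not sum anything, so it is the safer choice if the 0/1 matrix might contain more than one 1 per month; in that case it uses the first matching season.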

Comment: Hey, I updated my question with my code. I tried a merge, but I think my approach is wrong. How can I do this starting from the example I added to my question?
