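Rolling up data at the customer ID and product level
I have data in R in the following format, and I want to roll it up at the customer ID and product level: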
id product mcg txn
101 gold hotel 1
101 gold hotel 2
101 clas hotel 22
101 clas airline 23
My desired output:
id product hotel_txn airline_txn
101 gold 3 .
101 clas 22 23
Can anyone please help me get the expected output?
Basically, I am looking for an alternative to the SAS CASE WHEN statement.
We can use xtabs:
xtabs(txn~idprod + mcg, transform(df1, idprod = paste(id, product),
mcg = paste0(mcg, "_txn")))
# mcg
#idprod airline_txn hotel_txn
# 101 clas 23 22
# 101 gold 0 3
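For readers coming from SAS, a more literal translation of the CASE WHEN pattern may also be useful. The following is only a sketch in base R, assuming the sample data from the question is stored in a data frame named df1 (the name used in the xtabs call above); the result name rolled is my own choice. It builds one conditional column per category with ifelse and then sums within each id/product pair.

```r
# Sample data copied from the question; the name df1 follows the xtabs answer.
df1 <- data.frame(
  id = c(101, 101, 101, 101),
  product = c("gold", "gold", "clas", "clas"),
  mcg = c("hotel", "hotel", "hotel", "airline"),
  txn = c(1, 2, 22, 23)
)

# Rough equivalent of: CASE WHEN mcg = 'hotel' THEN txn ELSE 0 END
df1$hotel_txn <- ifelse(df1$mcg == "hotel", df1$txn, 0)
df1$airline_txn <- ifelse(df1$mcg == "airline", df1$txn, 0)

# Sum the conditional columns within each id/product combination.
rolled <- aggregate(cbind(hotel_txn, airline_txn) ~ id + product,
                    data = df1, FUN = sum)
print(rolled)
```

Note that this yields 0 rather than a missing value for the gold/airline cell, the same as the xtabs result.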
You can do this with dplyr and tidyr:
library(dplyr)
library(tidyr)
df %>% group_by(id, product, mcg) %>% summarise(txn = sum(txn)) %>% spread(mcg, txn)
Source: local data frame [2 x 4]
Groups: id, product [2]
id product airline hotel
<int> <fctr> <int> <int>
1 101 clas 23 22
2 101 gold NA 3
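In recent tidyr releases spread() is superseded by pivot_wider(), which can also add the _txn suffix the question asks for. The following is a sketch only, assuming the question's data is in a data frame named df with columns id, product, mcg and txn, and that dplyr 1.0 or later and tidyr 1.0 or later are installed; the result name wide is hypothetical.

```r
library(dplyr)
library(tidyr)

# Sample data copied from the question.
df <- data.frame(
  id = c(101, 101, 101, 101),
  product = c("gold", "gold", "clas", "clas"),
  mcg = c("hotel", "hotel", "hotel", "airline"),
  txn = c(1, 2, 22, 23)
)

wide <- df %>%
  group_by(id, product, mcg) %>%
  summarise(txn = sum(txn), .groups = "drop") %>%
  # One column per mcg value, named "<mcg>_txn"; empty cells become 0.
  pivot_wider(names_from = mcg, values_from = txn,
              names_glue = "{mcg}_txn", values_fill = 0)
print(wide)
```

Drop the values_fill argument if you would rather keep NA for combinations that have no transactions, as in the spread() output above.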
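This gives an error saying that the mcg column does not exist. Can you help? –
The dcast function from reshape2 is designed for exactly this kind of thing: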
#creates your data frame
df <- data.frame(id = c(101, 101, 101, 101),
product = c("gold", "gold", "clas", "clas"),
mcg = c("hotel", "hotel", "hotel", "airline"),
txn = c(1, 2, 22, 23))
#installs and loads the required package
install.packages("reshape2")
library(reshape2)
#the function you would use to create the new data frame
df2 <- dcast(df, id + product ~ mcg, value.var = "txn", sum)
print(df2)
id product airline hotel
1 101 clas 23 22
2 101 gold 0 3
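id product airline hotel gold clas 1 101 clas 23 22 3 45 Can we get the data in this form? –
@ankitagarwal Could you clarify your request? I don't understand what you are asking for in your comment. – bshelt141
Try `library(data.table); dcast(setDT(df1), id + product ~ mcg, value.var = "txn", sum)` – akrun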
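The data.table suggestion above can be expanded into a self-contained sketch. It assumes the data.table package is installed and uses the question's sample data; the names dt and wide, and the final renaming and reordering steps, are my additions to match the column names in the desired output.

```r
library(data.table)

# Sample data copied from the question.
dt <- data.table(
  id = c(101, 101, 101, 101),
  product = c("gold", "gold", "clas", "clas"),
  mcg = c("hotel", "hotel", "hotel", "airline"),
  txn = c(1, 2, 22, 23)
)

# Cast to wide format, summing txn within each id/product/mcg cell.
wide <- dcast(dt, id + product ~ mcg, value.var = "txn", fun.aggregate = sum)

# Rename and reorder the columns to match the requested layout.
setnames(wide, c("airline", "hotel"), c("airline_txn", "hotel_txn"))
setcolorder(wide, c("id", "product", "hotel_txn", "airline_txn"))
print(wide)
```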