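Weighted average with ddply is wrong (R, ddply)
Question:
I need to compute weighted averages in R when collapsing rows of data. The rows are collapsed by brand and name:
library(plyr)
name = c("car1", "car2", "car2", "car2", "car3", "car1")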
brand = c("b1", "b2", "b2", "b2", "b3", "b1")
production = c(10, 10, 30, 40, 10, 5)
fuelEconomy= c(1, 2, 3, 5, 2, 4)
size = c(10, 50, 30,40,20, 7)
adf = data.frame(brand, name, production, fuelEconomy, size)
adfSum <- ddply(adf, .(brand, name),
                summarise,
                fuelEconomySum = sum(fuelEconomy*production)/sum(production),
                productionSum = sum(production),
                sizeSum = (sum(size*production)/sum(production)))
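Result: the first weighted average (fuelEconomySum) is correct, but the last one, sizeSum, is not. The correct values are in parentheses.
brand name fuelEconomySum production sizeSum
b1    car1 2.000          15         17  (9)
b2    car2 3.875          80         120 (37.5)
b3    car3 2.000          10         20  (20)
I am looking for a solution that creates several weighted averages at once.
Thanks
Answer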
This works (using dplyr and magrittr):
name = c("car1", "car2", "car2", "car2", "car3", "car1")
brand = c("b1", "b2", "b2", "b2", "b3", "b1")
production = c(10, 10, 30, 40, 10, 5)
fuelEconomy= c(1, 2, 3, 5, 2, 4)
size = c(10, 50, 30,40,20, 7)
adf = data.frame(brand, name, production, fuelEconomy, size)
library(magrittr)
library(dplyr)
afdSum <- adf %>%
  group_by(brand, name) %>%
  summarise(fuelEconomySum = sum(fuelEconomy*production)/sum(production),
            productionSum = sum(production),
            sizeSum = sum(size*production)/sum(production)) %>%
  as.data.frame()
> afdSum
  brand name fuelEconomySum productionSum sizeSum
1    b1 car1          2.000            15     9.0
2    b2 car2          3.875            80    37.5
3    b3 car3          2.000            10    20.0
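To create several weighted averages at once without repeating the formula, one option is base R's weighted.mean() combined with dplyr's across(). This is only a sketch: it assumes dplyr 1.0.0 or later (where across() is available), and the output names fuelEconomyAvg, sizeAvg and adfAvg are made up for illustration.

```r
library(dplyr)

adf <- data.frame(
  brand       = c("b1", "b2", "b2", "b2", "b3", "b1"),
  name        = c("car1", "car2", "car2", "car2", "car3", "car1"),
  production  = c(10, 10, 30, 40, 10, 5),
  fuelEconomy = c(1, 2, 3, 5, 2, 4),
  size        = c(10, 50, 30, 40, 20, 7)
)

# Weight every listed column by production within each brand/name group.
# The "{.col}Avg" naming pattern is an assumption made for this sketch.
adfAvg <- adf %>%
  group_by(brand, name) %>%
  summarise(
    across(c(fuelEconomy, size),
           ~ weighted.mean(.x, production),
           .names = "{.col}Avg"),
    productionSum = sum(production),
    .groups = "drop"
  ) %>%
  as.data.frame()
```

Adding another column to the c(...) selection inside across() gives one more weighted average with no further changes.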
Edit: by the way, your solution works fine for me.
> devtools::session_info("plyr")
Session info ---------------------------------------------------------------------------
 setting  value
 version  R version 3.3.1 (2016-06-21)
 system   x86_64, linux-gnu
 ui       RStudio (0.99.491)
 language en_US
 collate  en_US.UTF-8
 tz       <NA>
 date     2016-09-14

Packages -------------------------------------------------------------------------------
 package * version date       source
 plyr    * 1.8.3   2015-06-12 CRAN (R 3.3.0)
 Rcpp      0.12.5  2016-05-14 CRAN (R 3.3.0)
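Thanks for your contribution. I found the error: it was in the naming of my variables. For this post I changed the output name to productionSum to make things clearer, but in my script I had simply called it production, the same name as the input column. As a result, the last expression used the summed production instead of the individual values.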
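The effect described in this comment can be reproduced with a short sketch. plyr's summarise evaluates its expressions in order, so once an output is named production, later expressions see the group total rather than the per-row values. The object names shadowed and fixed are hypothetical, chosen only for this illustration.

```r
library(plyr)

adf <- data.frame(
  brand       = c("b1", "b2", "b2", "b2", "b3", "b1"),
  name        = c("car1", "car2", "car2", "car2", "car3", "car1"),
  production  = c(10, 10, 30, 40, 10, 5),
  fuelEconomy = c(1, 2, 3, 5, 2, 4),
  size        = c(10, 50, 30, 40, 20, 7)
)

# The output reuses the input name `production`: from this point on,
# later expressions see the group total, not the per-row values.
shadowed <- ddply(adf, .(brand, name), summarise,
                  production = sum(production),
                  sizeSum    = sum(size * production) / sum(production))

# A distinct output name leaves the input column untouched.
fixed <- ddply(adf, .(brand, name), summarise,
               productionSum = sum(production),
               sizeSum       = sum(size * production) / sum(production))

shadowed$sizeSum  # 17 120 20, the wrong values from the question
fixed$sizeSum     # 9 37.5 20, the expected weighted averages
```

Computing the total last, or giving it a different name, avoids the problem in both cases.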