Building a contingency table

Problem description:

I have a table like this:

df <- data.frame(P1 = c(1,0,0,0,0,0,"A"), 
        P2 = c(0,-2,1,2,1,0,"A"), 
        P3 = c(-1,2,0,2,1,0,"B"), 
        P4 = c(2,0,-1,0,-1,0,"B"), 
        Names = c("G1","G2","G3","G1","G2","G3","Group"), 
        stringsAsFactors = FALSE)

which gives:

Names P1 P2 P3 P4 
G1  1 0  -1 2 
G2  0 -2 2 0 
G3  0 1  0 -1 
G1  0 2  2 0 
G2  0 1  1 -1 
G3  0 0  0 0 
Group A A  B B

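Here, A and B are the groups that the variables P1, P2, P3 and P4 belong to.

I want to build a contingency table of Id (G1, G2, ...), Group (A, B) and Var (-2, -1, 0, 1, 2), for example: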

Id Group Var Count 
G1 A  -2  0 
G1 A  -1  0 
G1 A  0  1 
G1 A  1  1 
G1 A  2  0 
G1 B  -2  0 
G1 B  -1  1 
G1 B  0  0 
G1 B  1  0 
G1 B  2  1 
G2 A  -2  1 
G2 A  -1  0 
G2 A  0  1 
...

Is there a way to do this in R without using a lot of loops?

How to make a great R reproducible example? (http://stackoverflow.com/questions/5963269) – Sotos

Thanks @Sotos, I added df – Sosi

I think your output is inconsistent with your 'df': shouldn't 'Group' be a variable? It appears as a row... – mdag02

Answer

Assuming you want to group the P1 & P2 columns as A and the P3 & P4 columns as B, you can solve this with the data.table package as follows:

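library(data.table)

# Inner melt: stack P1/P2 into column A and P3/P4 into column B.
# Outer melt: stack A and B into one 'value' column, labelled by 'group'.
DT <- melt(melt(setDT(df),
                measure.vars = list(c(2,3), c(4,5)),
                value.name = c("A","B")),
           id.vars = 1, measure.vars = 3:4, variable.name = 'group'
           )[order(Id,group)][, val2 := value]

# Join onto every Id/group/value combination, then count the matches;
# combinations that never occur have NA in val2 and get a count of 0.
DT[CJ(Id = Id, group = group, value = value, unique = TRUE)
   , on = .(Id, group, value)
   ][, .(counts = sum(!is.na(val2))), by = .(Id, group, value)]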

which gives:

Id group value counts 
1: G1  A -2  0 
2: G1  A -1  0 
3: G1  A  0  2 
4: G1  A  1  1 
5: G1  A  2  1 
6: G1  B -2  0 
7: G1  B -1  1 
8: G1  B  0  1 
9: G1  B  1  0 
10: G1  B  2  2 
11: G2  A -2  1 
12: G2  A -1  0 
13: G2  A  0  2 
14: G2  A  1  1 
15: G2  A  2  0 
16: G2  B -2  0 
17: G2  B -1  1 
18: G2  B  0  1 
19: G2  B  1  1 
20: G2  B  2  1 
21: G3  A -2  0 
22: G3  A -1  0 
23: G3  A  0  3 
24: G3  A  1  1 
25: G3  A  2  0 
26: G3  B -2  0 
27: G3  B -1  1 
28: G3  B  0  3 
29: G3  B  1  0 
30: G3  B  2  0
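The zero rows in this output come from the join against `CJ()`, which builds every combination of its arguments. A minimal sketch of that step in isolation, using toy observations rather than the question's data (the names `grid`, `obs` and `res` are made up for illustration):

```r
library(data.table)

# All combinations of the three keys: 3 Ids x 2 groups x 5 values = 30 rows
grid <- CJ(Id = c("G1", "G2", "G3"), group = c("A", "B"), value = -2:2)

# Toy observations: only one (Id, group, value) combination occurs, twice
obs <- data.table(Id = c("G1", "G1"), group = c("A", "A"), value = c(0L, 0L))
obs[, val2 := value]

# Join the observations onto the grid; grid rows without a match get NA in val2,
# so counting the non-NA entries yields 0 for them
res <- obs[grid, on = .(Id, group, value)
           ][, .(counts = sum(!is.na(val2))), by = .(Id, group, value)]
```

Here `res` has one row per combination in `grid`, and only the G1/A/0 row has a non-zero count.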

Data used:

df <- read.table(text="Id  P1 P2 P3 P4 
G1  1 0 -1 2 
G2  0 -2 2  0 
G3  0 1 0  -1 
G1  0 2 2  0 
G2  0 1 1  -1 
G3  0 0 0  0", header=TRUE, stringsAsFactors = FALSE)

Note that I omitted the 'Group' row because, according to your comment, it only indicates which group each of the P1 to P4 columns belongs to.
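If you start from the original `df` in the question, which still contains the 'Group' row and therefore has character columns, a base R sketch along these lines could recover both the group mapping and the numeric data. The names `p_cols`, `grp` and `dat` are illustrative and not part of the original code:

```r
# Original data from the question: the last row holds the group labels
df <- data.frame(P1 = c(1,0,0,0,0,0,"A"),
                 P2 = c(0,-2,1,2,1,0,"A"),
                 P3 = c(-1,2,0,2,1,0,"B"),
                 P4 = c(2,0,-1,0,-1,0,"B"),
                 Names = c("G1","G2","G3","G1","G2","G3","Group"),
                 stringsAsFactors = FALSE)

p_cols <- c("P1", "P2", "P3", "P4")
is_group_row <- df$Names == "Group"

# Named character vector mapping each P column to its group label
grp <- unlist(df[is_group_row, p_cols])

# Drop the label row, convert the P columns back to numbers,
# and rename 'Names' to 'Id' to match the data used in this answer
dat <- df[!is_group_row, ]
dat[p_cols] <- lapply(dat[p_cols], as.numeric)
names(dat)[names(dat) == "Names"] <- "Id"
dat <- dat[, c("Id", p_cols)]
```

After this, `grp` holds the column-to-group mapping, so the A/B assignment does not have to be hard-coded.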

Indeed, thanks a lot! – Sosi

Answer

With the tidyverse and the same data:

library(tidyverse) 

df <- read.table(text="Id  P1 P2 P3 P4 
G1  1 0 -1 2 
G2  0 -2 2  0 
G3  0 1 0  -1 
G1  0 2 2  0 
G2  0 1 1  -1 
G3  0 0 0  0", header=TRUE, stringsAsFactors = FALSE)

We reshape the table and recode the P* variables into group, then count and complete the missing cases:

df %>% 
    gather(P1, P2, P3, P4, key = "p", value = "v") %>% 
    mutate(group = ifelse(p %in% c("P1", "P2"), "A", "B")) %>% 
    group_by(Id, group, v) %>% 
    summarise(Count = n()) %>% 
    ungroup() %>% 
    complete(Id, group, v, fill = list("Count" = 0))

If you don't need all the combinations in the output, just use:

df %>% 
    gather(P1, P2, P3, P4, key = "p", value = "v") %>% 
    mutate(group = ifelse(p %in% c("P1", "P2"), "A", "B")) %>% 
    group_by(Id, group, v) %>% 
    summarise(Count = n()) 

# A tibble: 17 x 4 
# Groups: Id, group [?] 
     Id group v  Count 
     <chr> <chr> <int> <int> 
1 G1  A  0  2 
2 G1  A  1  1 
3 G1  A  2  1 
4 G1  B -1  1 
5 G1  B  0  1 
6 G1  B  2  2 
7 G2  A -2  1 
8 G2  A  0  2 
9 G2  A  1  1 
10 G2  B -1  1 
11 G2  B  0  1 
12 G2  B  1  1 
13 G2  B  2  1 
14 G3  A  0  3 
15 G3  A  1  1 
16 G3  B -1  1 
17 G3  B  0  3
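For comparison, the same reshape, recode, count and complete steps can be sketched in base R, since `table()` already reports zero counts for combinations that do not occur. This is an alternative sketch rather than the method used above, and `long` and `res` are illustrative names:

```r
df <- read.table(text = "Id P1 P2 P3 P4
G1 1 0 -1 2
G2 0 -2 2 0
G3 0 1 0 -1
G1 0 2 2 0
G2 0 1 1 -1
G3 0 0 0 0", header = TRUE, stringsAsFactors = FALSE)

p_cols <- c("P1", "P2", "P3", "P4")

# Reshape to long format: one row per (Id, column, value)
long <- data.frame(Id = rep(df$Id, times = length(p_cols)),
                   p  = rep(p_cols, each = nrow(df)),
                   v  = unlist(df[p_cols], use.names = FALSE))

# Recode the P* columns into groups
long$group <- ifelse(long$p %in% c("P1", "P2"), "A", "B")

# Cross-tabulate; table() keeps the zero-count combinations
res <- as.data.frame(table(Id = long$Id, group = long$group, v = long$v),
                     responseName = "Count", stringsAsFactors = FALSE)
res$v <- as.integer(res$v)
res <- res[order(res$Id, res$group, res$v), ]
```

`res` then has one row for every Id, group and value combination, including those with a count of 0.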
