How can I use the apply functions in R when the function needs both the column name and the values of that same column as arguments?

Problem description:

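I want to create a function that, for a given data frame, takes a column name as its first argument and a value from that column (a row value of that particular column) as its second argument. The second argument is then converted to a numeric value according to the mapping set up in a switch statement. How can I do this with the apply family of functions in R?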

This is what I have been working on so far.

# print("ERROR in Question") is the fallback if no question matches at all
scoreraw <- function(Question, Answer) {
    switch(Question,
        "Today is my favourite day?" =
            switch(Answer, "Strongly Agree" = 3, "Agree" = 2, "Disagree" = 1, "Strongly Disagree" = 0),
        "I hate Tuesdays?" =
            switch(Answer, "Strongly Agree" = 0, "Agree" = 1, "Disagree" = 2, "Strongly Disagree" = 3),
        print("ERROR in Question"))
}

Here is a quick test of the function, showing how it works:

# We expect the value to be 3 based on the Question and Answer argument 
scoreraw("Today is my favourite day?","Strongly Agree") 
    # [1] 3 



#Let us now create a dummy dataset of questions 

x <- c("Strongly Agree","Agree","Disagree","Strongly Disagree") 
y <- c("Strongly Agree","Agree","Disagree","Strongly Disagree") 

c <- data.frame(x,y) 

# Just changing the names to match the questions in the switch statement 
colnames(c) <- c("Today is my favourite day?", "I hate Tuesdays?") 

# Convert the two columns to character: before R 4.0.0, data.frame() turned
# strings into factors by default, and switch() needs a character string to match by name
c$`Today is my favourite day?` <- as.character(c$`Today is my favourite day?`)
c$`I hate Tuesdays?` <- as.character(c$`I hate Tuesdays?`)

#> c
#   Today is my favourite day?  I hate Tuesdays?
# 1             Strongly Agree    Strongly Agree
# 2                      Agree             Agree
# 3                   Disagree          Disagree
# 4          Strongly Disagree Strongly Disagree

This is what I want the data frame to look like after applying my function:

#   Today is my favourite day? I hate Tuesdays?
# 1                          3                0
# 2                          2                1
# 3                          1                2
# 4                          0                3

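I then tried to use the apply function, but my problem is this: how do I pick an arbitrary column name and apply the function to all row values in that column? At the moment I can only apply the function by manually selecting a column name and a single row value.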

#Example of selecting column name and row value manually 
scoreraw(colnames(c)[2],c[1,2]) 
# [1] 0 

Edit: current working code for selecting an arbitrary column (scoreraw and the data frame c are as defined above)

call_scoreraw <- function(n, DF) { 
    sapply(DF[[n]], function(x) scoreraw(colnames(DF)[n], x)) 
} 

#I included unlist as I noticed the output can also be a list 
a <- unlist(call_scoreraw(1, c)) 
b <- as.data.frame(a) 

I am now trying to use a for loop inside call_scoreraw so that scoreraw is applied to any column or columns.

call_scoreraw <- function(n, DF) { 
    Storage <- numeric(ncol(DF)) 
    for (i in n:ncol(DF)){ 
    Storage[i] <- sapply(DF[,i], function(x) scoreraw(colnames(DF)[i], x)) 
    } 
} 

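As you can see, I currently need a way to store the values produced by the for loop. I have not been able to do this with the Storage variable I defined. Any suggestions on how to do this?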

There is a typo in the function 'scoreraw': it should be 'Tuesdays', not 'tuesdays'. –

Thanks, I have now fixed the typo. @RuiBarradas – MrReference

Define another function that calls scoreraw. Something like this:

call_scoreraw <- function(n, DF) { 
    if(length(n) > 1){ 
     t(sapply(n, function(i){ 
      sapply(DF[[i]], function(x) scoreraw(colnames(DF)[i], x)) 
     })) 
    } else { 
     sapply(DF[[n]], function(x) scoreraw(colnames(DF)[n], x)) 
    } 
} 

call_scoreraw(2, c) 
#    Strongly Agree             Agree          Disagree Strongly Disagree
#                 0                 1                 2                 3

call_scoreraw(1:2, c) 
#     Strongly Agree Agree Disagree Strongly Disagree
#[1,]              3     2        1                 0
#[2,]              0     1        2                 3

Note that when n is a vector of values, the result is an object of class matrix. If you want, you can coerce it to a data.frame.

res <- call_scoreraw(1:2, c) 
res2 <- as.data.frame(res) 
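
If the goal is the layout shown in the question (one column per question, one row per respondent), one possible alternative is to walk over the column names and the columns in parallel with Map. The following is only a minimal sketch and not part of the original answer: score_all is a hypothetical helper name, the data frame is called df here instead of c, and scoreraw is repeated from the question only so that the example runs on its own.

```r
# scoreraw() as defined in the question, repeated so the sketch is self-contained
scoreraw <- function(Question, Answer) {
    switch(Question,
        "Today is my favourite day?" =
            switch(Answer, "Strongly Agree" = 3, "Agree" = 2,
                   "Disagree" = 1, "Strongly Disagree" = 0),
        "I hate Tuesdays?" =
            switch(Answer, "Strongly Agree" = 0, "Agree" = 1,
                   "Disagree" = 2, "Strongly Disagree" = 3),
        print("ERROR in Question"))
}

# Same dummy data as in the question, stored as character columns
answers <- c("Strongly Agree", "Agree", "Disagree", "Strongly Disagree")
df <- data.frame(answers, answers, stringsAsFactors = FALSE)
colnames(df) <- c("Today is my favourite day?", "I hate Tuesdays?")

# score_all() is a hypothetical helper: Map() pairs each column name with
# its column, so scoreraw() receives the name and every value of that column
score_all <- function(DF) {
    scored <- Map(function(question, column) {
        vapply(column, function(answer) scoreraw(question, answer),
               numeric(1), USE.NAMES = FALSE)
    }, names(DF), DF)
    # check.names = FALSE keeps the question text as the column names
    data.frame(scored, check.names = FALSE)
}

scores <- score_all(df)
print(scores)
```

Here vapply with numeric(1) is a deliberate choice: it stops with an error if scoreraw ever returns something other than a single number, for example when an answer does not match any case.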

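What if you want to apply this call_scoreraw function to an arbitrary number of columns? I would like to automate this so that I do not have to manually index the columns I want to select. In other words, an implicit for loop. @RuiBarradas – MrReference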

I tried the code and it works well. For my own learning, could you explain this part of the code: t(sapply(n, function(i) sapply(DF[[i]], function(x) scoreraw(colnames(DF)[i], x)))), and in particular the piece function(i) sapply(DF[[i]], function(x) scoreraw(colnames(DF)[i], x))? – MrReference

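@MrReference The first 'sapply' loops over the vector 'n', the second over the vector 'DF[[i]]'. So for each 'i' in 'n' and each 'x' in 'DF[[i]]', the function 'scoreraw' is applied; it takes two arguments, a column name and a scalar. –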
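
The two nested sapply calls can also be unrolled into explicit for loops, which is one way to store the loop results that the question asked about. This is a sketch under assumptions rather than the answerer's code: call_scoreraw_loop and df are hypothetical names, and scoreraw is repeated from the question only so that the example is self-contained.

```r
# scoreraw() as defined in the question, repeated so the sketch is self-contained
scoreraw <- function(Question, Answer) {
    switch(Question,
        "Today is my favourite day?" =
            switch(Answer, "Strongly Agree" = 3, "Agree" = 2,
                   "Disagree" = 1, "Strongly Disagree" = 0),
        "I hate Tuesdays?" =
            switch(Answer, "Strongly Agree" = 0, "Agree" = 1,
                   "Disagree" = 2, "Strongly Disagree" = 3),
        print("ERROR in Question"))
}

# Same dummy data as in the question, stored as character columns
answers <- c("Strongly Agree", "Agree", "Disagree", "Strongly Disagree")
df <- data.frame(answers, answers, stringsAsFactors = FALSE)
colnames(df) <- c("Today is my favourite day?", "I hate Tuesdays?")

# call_scoreraw_loop() is a hypothetical name for the unrolled version
call_scoreraw_loop <- function(n, DF) {
    # A list can hold one whole vector per column
    storage <- vector("list", length(n))
    names(storage) <- colnames(DF)[n]
    for (k in seq_along(n)) {            # outer sapply: loop over the columns in n
        i <- n[k]
        scores <- numeric(nrow(DF))
        for (r in seq_len(nrow(DF))) {   # inner sapply: loop over the values of DF[[i]]
            scores[r] <- scoreraw(colnames(DF)[i], DF[[i]][r])
        }
        storage[[k]] <- scores
    }
    storage                              # return the filled list explicitly
}

# seq_along(df) selects every column, so no column index is typed by hand
result <- call_scoreraw_loop(seq_along(df), df)
print(result)
```

Storing into a list with [[k]] sidesteps the two problems in the question's loop: each slot of a numeric vector can hold only a single number, not a whole column of scores, and that function never returned Storage.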