Replace NA in a column with the value from an adjacent column
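This question is related to a post with a similar title (replace NA in an R vector with adjacent values). I want to scan a column in a data frame and replace each NA with the value in the adjacent cell. In that post, the solution was not to replace the NA with a value from the adjacent vector (i.e. the adjacent element in the data matrix) but to do a conditional replace with a fixed value. Below is a reproducible example of my problem: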
UNIT <- c(NA,NA, 200, 200, 200, 200, 200, 300, 300, 300,300)
STATUS <-c('ACTIVE','INACTIVE','ACTIVE','ACTIVE','INACTIVE','ACTIVE','INACTIVE','ACTIVE','ACTIVE',
'ACTIVE','INACTIVE')
TERMINATED <- c('1999-07-06' , '2008-12-05' , '2000-08-18' , '2000-08-18' ,'2000-08-18' ,'2008-08-18',
'2008-08-18','2006-09-19','2006-09-19' ,'2006-09-19' ,'1999-03-15')
START <- c('2007-04-23','2008-12-06','2004-06-01','2007-02-01','2008-04-19','2010-11-29','2010-12-30',
'2007-10-29','2008-02-05','2008-06-30','2009-02-07')
STOP <- c('2008-12-05','4712-12-31','2007-01-31','2008-04-18','2010-11-28','2010-12-29','4712-12-31',
'2008-02-04','2008-06-29','2009-02-06','4712-12-31')
TEST <- data.frame(UNIT, STATUS, TERMINATED, START, STOP)
TEST
UNIT STATUS TERMINATED START STOP
1 NA ACTIVE 1999-07-06 2007-04-23 2008-12-05
2 NA INACTIVE 2008-12-05 2008-12-06 4712-12-31
3 200 ACTIVE 2000-08-18 2004-06-01 2007-01-31
4 200 ACTIVE 2000-08-18 2007-02-01 2008-04-18
5 200 INACTIVE 2000-08-18 2008-04-19 2010-11-28
6 200 ACTIVE 2008-08-18 2010-11-29 2010-12-29
7 200 INACTIVE 2008-08-18 2010-12-30 4712-12-31
8 300 ACTIVE 2006-09-19 2007-10-29 2008-02-04
9 300 ACTIVE 2006-09-19 2008-02-05 2008-06-29
10 300 ACTIVE 2006-09-19 2008-06-30 2009-02-06
11 300 INACTIVE 1999-03-15 2009-02-07 4712-12-31
#using the syntax for a conditional replace and hoping it works :/
TEST$UNIT[is.na(TEST$UNIT)] <- TEST$STATUS; TEST
UNIT STATUS TERMINATED START STOP
1 1 ACTIVE 1999-07-06 2007-04-23 2008-12-05
2 2 INACTIVE 2008-12-05 2008-12-06 4712-12-31
3 200 ACTIVE 2000-08-18 2004-06-01 2007-01-31
4 200 ACTIVE 2000-08-18 2007-02-01 2008-04-18
5 200 INACTIVE 2000-08-18 2008-04-19 2010-11-28
6 200 ACTIVE 2008-08-18 2010-11-29 2010-12-29
7 200 INACTIVE 2008-08-18 2010-12-30 4712-12-31
8 300 ACTIVE 2006-09-19 2007-10-29 2008-02-04
9 300 ACTIVE 2006-09-19 2008-02-05 2008-06-29
10 300 ACTIVE 2006-09-19 2008-06-30 2009-02-06
11 300 INACTIVE 1999-03-15 2009-02-07 4712-12-31
The result should be:
UNIT STATUS TERMINATED START STOP
1 ACTIVE ACTIVE 1999-07-06 2007-04-23 2008-12-05
2 INACTIVE INACTIVE 2008-12-05 2008-12-06 4712-12-31
3 200 ACTIVE 2000-08-18 2004-06-01 2007-01-31
4 200 ACTIVE 2000-08-18 2007-02-01 2008-04-18
5 200 INACTIVE 2000-08-18 2008-04-19 2010-11-28
6 200 ACTIVE 2008-08-18 2010-11-29 2010-12-29
7 200 INACTIVE 2008-08-18 2010-12-30 4712-12-31
8 300 ACTIVE 2006-09-19 2007-10-29 2008-02-04
9 300 ACTIVE 2006-09-19 2008-02-05 2008-06-29
10 300 ACTIVE 2006-09-19 2008-06-30 2009-02-06
11 300 INACTIVE 1999-03-15 2009-02-07 4712-12-31
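It didn't work because STATUS is a factor. A factor is stored as integer codes, so assigning it into a numeric column inserts the codes (1 and 2 here) rather than the labels. If you coerce STATUS to character, you get the result you want, and the UNIT column becomes a character vector: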
TEST$UNIT[is.na(TEST$UNIT)] <- as.character(TEST$STATUS[is.na(TEST$UNIT)])
## UNIT STATUS TERMINATED START STOP
## 1 ACTIVE ACTIVE 1999-07-06 2007-04-23 2008-12-05
## 2 INACTIVE INACTIVE 2008-12-05 2008-12-06 4712-12-31
## 3 200 ACTIVE 2000-08-18 2004-06-01 2007-01-31
## 4 200 ACTIVE 2000-08-18 2007-02-01 2008-04-18
## 5 200 INACTIVE 2000-08-18 2008-04-19 2010-11-28
## 6 200 ACTIVE 2008-08-18 2010-11-29 2010-12-29
## 7 200 INACTIVE 2008-08-18 2010-12-30 4712-12-31
## 8 300 ACTIVE 2006-09-19 2007-10-29 2008-02-04
## 9 300 ACTIVE 2006-09-19 2008-02-05 2008-06-29
## 10 300 ACTIVE 2006-09-19 2008-06-30 2009-02-06
## 11 300 INACTIVE 1999-03-15 2009-02-07 4712-12-31
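To see where the 1 and 2 in the question's output come from, here is a minimal sketch. It assumes STATUS is a factor, which is what data.frame() produced by default before R 4.0.0; stringsAsFactors = TRUE is set explicitly so the behaviour also reproduces on newer versions. The three-row data frame `df` and the helper names `codes`, `labels`, and `na_rows` are made up for illustration.

```r
# Hypothetical three-row data frame; stringsAsFactors = TRUE forces STATUS
# to be a factor, as data.frame() did by default before R 4.0.0.
df <- data.frame(UNIT = c(NA, NA, 200),
                 STATUS = c("ACTIVE", "INACTIVE", "ACTIVE"),
                 stringsAsFactors = TRUE)

# A factor is stored as integer codes with the labels attached as levels.
codes  <- as.integer(df$STATUS)    # 1 2 1
labels <- as.character(df$STATUS)  # "ACTIVE" "INACTIVE" "ACTIVE"

# Assigning the labels (not the codes) into the NA positions coerces the
# whole UNIT column to character.
na_rows <- is.na(df$UNIT)
df$UNIT[na_rows] <- as.character(df$STATUS[na_rows])
class(df$UNIT)  # "character"
df$UNIT         # "ACTIVE" "INACTIVE" "200"
```

Note that the numeric 200 ends up as the string "200", since a single column can hold only one type.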
You beat me by 6 seconds. +1 (I'm deleting mine). – A5C1D2H2I1M1N2O1R2T1 2013-03-26 05:25:56
Good thing it's code and not pistols :) – 2013-03-26 05:26:29
Thanks, guys! That did the trick. – 2013-03-26 05:53:24
You need to do
TEST$UNIT[is.na(TEST$UNIT)] <- TEST$STATUS[is.na(TEST$UNIT)]
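so that each NA is replaced by the value adjacent to it. Otherwise, the number of values being replaced does not match the number of replacement values, and the replacements are taken in order from the top of STATUS instead of from the matching rows. It appears to work in this case only because the two values being replaced are in the first two rows.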
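The alignment issue is easier to see when the NAs are not in the first rows. The following is a small sketch with made-up vectors (`unit`, `status`, `wrong`, and `right` are hypothetical names, and plain character values are used so that factors do not get in the way):

```r
# Hypothetical data with the NAs in positions 2 and 4 rather than at the top.
unit   <- c(100, NA, 200, NA)
status <- c("A", "B", "C", "D")

# Right-hand side not subsetted: the replacements are taken from the start
# of `status`, so the NAs receive "A" and "B". R also warns that the number
# of items to replace is not a multiple of the replacement length.
wrong <- unit
wrong[is.na(wrong)] <- status
wrong  # "100" "A" "200" "B"

# Subsetting both sides with the same logical mask keeps the rows aligned.
right <- unit
right[is.na(right)] <- status[is.na(right)]
right  # "100" "B" "200" "D"
```

Only the second form puts "B" and "D", the values that sit next to the NAs, into the gaps.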
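I think this is fine as an answer. Granted, the solution is the same as the one others gave, but you have added an explanation of what is happening. In my opinion it should not be a comment. – 2016-08-31 16:01:14
Maybe try 'TEST$UNIT[is.na(TEST$UNIT)] – Seth 2013-03-26 05:23:17
You can't mix types within a column of a data frame. – 2013-03-26 05:24:12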