Fill the NULL gaps in a table with averaged values
I have a table with the fields (id, letter, date)
and some data in it:
1 A 2012-01-01
2 B NULL
3 C NULL
4 D 2012-01-15
I want to fill the NULL values with the average of the nearest non-NULL dates, like this:
1 A 2012-01-01
2 B 2012-01-08
3 C 2012-01-08
4 D 2012-01-15
Or perhaps even like this:
1 A 2012-01-01
2 B 2012-01-08
3 C 2012-01-11
4 D 2012-01-15
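Either variant would be great. Is there a simple way to do this in MySQL?
Thanks in advance.
UPD: The table is fairly large, about 700,000 records, with about 50,000 gaps like the ones described.
UPD2: To make it a bit clearer, the table can look like this: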
1 A 2012-01-01
2 B NULL
3 C NULL
4 D 2012-01-15
5 E NULL
6 F 2012-01-17
7 G NULL
8 H NULL
9 I 2012-01-20
The expected result is:
1 A 2012-01-01
2 B **2012-01-08**
3 C **2012-01-08**
4 D 2012-01-15
5 E **2012-01-16**
6 F 2012-01-17
7 G **2012-01-18**
8 H **2012-01-18**
9 I 2012-01-20
(The asterisks mark the changed values.) Thanks.
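To make the expected result concrete, here is a minimal sketch in Python (not MySQL) of the midpoint rule described above. The function name `fill_gaps_with_midpoint` is hypothetical, and the sketch assumes the rows are already sorted by id and that the midpoint is truncated to a whole day. Rows that lack a non-NULL neighbour on either side are left as None.

```python
from datetime import date, timedelta

def fill_gaps_with_midpoint(rows):
    """rows: list of (id, letter, date-or-None), assumed sorted by id."""
    # Forward pass: nearest non-NULL date at or before each row.
    prev_dates, last = [], None
    for _, _, d in rows:
        if d is not None:
            last = d
        prev_dates.append(last)
    # Backward pass: nearest non-NULL date at or after each row.
    next_dates, nxt = [None] * len(rows), None
    for i in range(len(rows) - 1, -1, -1):
        if rows[i][2] is not None:
            nxt = rows[i][2]
        next_dates[i] = nxt
    filled = []
    for (row_id, letter, d), lo, hi in zip(rows, prev_dates, next_dates):
        if d is None and lo is not None and hi is not None:
            # Midpoint of the two bounding dates, truncated to a whole day.
            d = lo + timedelta(days=(hi - lo).days // 2)
        filled.append((row_id, letter, d))
    return filled

rows = [
    (1, "A", date(2012, 1, 1)), (2, "B", None), (3, "C", None),
    (4, "D", date(2012, 1, 15)), (5, "E", None), (6, "F", date(2012, 1, 17)),
    (7, "G", None), (8, "H", None), (9, "I", date(2012, 1, 20)),
]
result = fill_gaps_with_midpoint(rows)
```

Both passes are linear in the number of rows, so the same idea should scale to a table of the size mentioned in UPD.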
UPD3: Thanks to everyone. But I will do it another way, computing the date with a simple formula: needed_date = [(max(date) - min(date)) / (max(id) - min(id))] * (my_id - min(id)) + min(date)
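The UPD3 formula is a linear interpolation of the date over the id. A minimal Python sketch of it follows. The function name `interpolate_date` and its parameters are hypothetical, and the post does not say whether the min/max bounds are taken over the whole table or over the rows around each gap, so the bounds are passed in explicitly. Truncating to whole days is also an assumption.

```python
from datetime import date, timedelta

def interpolate_date(my_id, min_id, max_id, min_date, max_date):
    """Linear interpolation of a date between two (id, date) anchors."""
    if max_id == min_id:
        raise ValueError("max_id and min_id must differ")
    span_days = (max_date - min_date).days
    # Integer arithmetic: multiply first, then floor-divide,
    # so the result is truncated to a whole day.
    offset_days = span_days * (my_id - min_id) // (max_id - min_id)
    return min_date + timedelta(days=offset_days)

# Using rows 1 and 4 of the first sample as the anchors (an assumption):
d2 = interpolate_date(2, 1, 4, date(2012, 1, 1), date(2012, 1, 15))
d3 = interpolate_date(3, 1, 4, date(2012, 1, 1), date(2012, 1, 15))
```

With these anchors the two gaps get different dates (2012-01-05 and 2012-01-10) instead of sharing a single midpoint.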
Assume you have a table called T,
like this:
CREATE TABLE T(
id INT,
time DATETIME
);
The following query gives you the bounds (the nearest non-NULL time before and after each NULL row):
SELECT T.Id
, MAX(T1.Time) as MinDate
, MIN(T2.Time) as MaxDate
FROM T
INNER JOIN T T1 ON T1.Id < T.Id
AND T.time IS NULL
AND NOT T1.time IS NULL
INNER JOIN T T2 ON T2.id > T.id
AND T.time IS NULL
AND NOT T2.time IS NULL
GROUP BY T.Id
The output will be:
Id MinDate MaxDate
2 2012-01-01 2012-01-15
3 2012-01-01 2012-01-15
So the next step is to use this result set, averaging the two bounds for instance, to update the NULL values:
UPDATE T
INNER JOIN
(
SELECT T.Id, MAX(T1.Time) as MinTime, MIN(T2.Time) as MaxTime
FROM T
INNER JOIN T T1 ON T1.id < T.id
AND T.time IS NULL
AND NOT T1.time IS NULL
INNER JOIN T T2 ON T2.id > T.id
AND T.time IS NULL
AND NOT T2.time IS NULL
GROUP BY T.ID) T3
ON T3.id = T.id
SET T.time = FROM_UNIXTIME((UNIX_TIMESTAMP(T3.MinTime) + UNIX_TIMESTAMP(T3.MaxTime))/2)
WHERE T.time IS NULL
Thanks, it is a solution. But its 'EXPLAIN' on a table of about 700,000 records is not that good :( – 2013-03-22 18:26:46
QUERY#1
SELECT id,letter,IFNULL(date,dt) date FROM mytable,
(SELECT DATE(mindate + INTERVAL (secdiff/2) SECOND) dt
FROM (SELECT mindate,UNIX_TIMESTAMP(maxdate)
- UNIX_TIMESTAMP(mindate) secdiff
FROM (SELECT MIN(date) mindate FROM mytable) N,
(SELECT MAX(date) maxdate FROM mytable) X) AA) A;
Sample data
mysql> DROP TABLE IF EXISTS mytable;
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TABLE mytable
-> (
-> id int not null auto_increment,
-> letter char(1),
-> `date` date,
-> primary key (id)
->);
Query OK, 0 rows affected (0.07 sec)
mysql> INSERT INTO mytable (letter,date) VALUES
-> ('A','2012-01-01'),('B',NULL),('C',NULL),('D','2012-01-15');
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM mytable;
+----+--------+------------+
| id | letter | date |
+----+--------+------------+
| 1 | A | 2012-01-01 |
| 2 | B | NULL |
| 3 | C | NULL |
| 4 | D | 2012-01-15 |
+----+--------+------------+
4 rows in set (0.00 sec)
mysql>
QUERY#1 executed
mysql> SELECT id,letter,IFNULL(date,dt) date FROM mytable,
-> (SELECT DATE(mindate + INTERVAL (secdiff/2) SECOND) dt
-> FROM (SELECT mindate,UNIX_TIMESTAMP(maxdate)
-> - UNIX_TIMESTAMP(mindate) secdiff
-> FROM (SELECT MIN(date) mindate FROM mytable) N,
-> (SELECT MAX(date) maxdate FROM mytable) X) AA) A;
+----+--------+------------+
| id | letter | date |
+----+--------+------------+
| 1 | A | 2012-01-01 |
| 2 | B | 2012-01-08 |
| 3 | C | 2012-01-08 |
| 4 | D | 2012-01-15 |
+----+--------+------------+
4 rows in set (0.00 sec)
mysql>
QUERY#2 (cleaner version)
This query uses the average of the UNIX timestamps of the non-NULL dates. If all the dates are NULL, it uses today's date:
SELECT id,letter,IFNULL(date,dt) date FROM mytable,
(
SELECT IF(K=0,DATE(NOW()),avgdt) dt FROM
(SELECT DATE(FROM_UNIXTIME(AVG(UNIX_TIMESTAMP(date))))
avgdt FROM mytable) AA,
(SELECT COUNT(date) K FROM mytable) BB
) A;
QUERY#2 executed
mysql> SELECT id,letter,IFNULL(date,dt) date FROM mytable,
-> (
-> SELECT IF(K=0,DATE(NOW()),avgdt) dt FROM
-> (SELECT DATE(FROM_UNIXTIME(AVG(UNIX_TIMESTAMP(date))))
-> avgdt FROM mytable) AA,
-> (SELECT COUNT(date) K FROM mytable) BB
->) A;
+----+--------+------------+
| id | letter | date |
+----+--------+------------+
| 1 | A | 2012-01-01 |
| 2 | B | 2012-01-08 |
| 3 | C | 2012-01-08 |
| 4 | D | 2012-01-15 |
+----+--------+------------+
4 rows in set (0.05 sec)
mysql>
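The logic of QUERY#2 can be sketched outside MySQL as follows. This is an illustrative Python version, not the query itself: the name `fill_with_global_average` and the `today` parameter are hypothetical, and it averages day ordinals rather than UNIX timestamps, which is an assumption made to keep time zones out of the example.

```python
from datetime import date

def fill_with_global_average(rows, today=None):
    """rows: list of (id, letter, date-or-None)."""
    known = [d for _, _, d in rows if d is not None]
    if known:
        # Average of all non-NULL dates, truncated to a whole day.
        avg_ordinal = sum(d.toordinal() for d in known) // len(known)
        fallback = date.fromordinal(avg_ordinal)
    else:
        # No non-NULL dates at all: fall back to today's date.
        fallback = today or date.today()
    return [(i, letter, d if d is not None else fallback)
            for i, letter, d in rows]

sample = [(1, "A", date(2012, 1, 1)), (2, "B", None),
          (3, "C", None), (4, "D", date(2012, 1, 15))]
filled = fill_with_global_average(sample)
```

Every NULL receives the same value here, because the average is taken over the whole table rather than over the neighbours of each gap.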
Give it a try!
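Thanks. But this will change all my NULLs to the average date between the first pair of non-NULL values. Please see the updated question :( – 2013-03-22 19:08:51
Looking at the first data set, if you had further records such as '5,E,NULL' and '6,F,2012-01-20', what would the result be? – 2013-03-22 18:05:12
Why do you want to manipulate the data? This calculation should be done when the records are retrieved. – Kermit 2013-03-22 18:05:32
What is the correlation between the order of the records and the field values (i.e. will B always come before in time)? – 2013-03-22 18:07:50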