How do I remove duplicate rows in Redshift when the only difference is an incrementing date field?
Question:
In the sample table below, the rows dated 1/3 through 1/5 differ only in the date field.
Date, ID, State, Amount
1/1/2017, 123, A, 1
1/2/2017, 123, B, 1
1/3/2017, 123, C, 1
1/4/2017, 123, C, 1
1/5/2017, 123, C, 1
1/6/2017, 345, B, 1
How can I remove the rows that are duplicates on the ID, State, and Amount columns, so that the table looks like this?
Date, ID, State, Amount
1/1/2017, 123, A, 1
1/2/2017, 123, B, 1
1/3/2017, 123, C, 1
1/6/2017, 345, B, 1
Answer:
select min(date) as date, id, state, amount
from some_table_name_you_never_disclosed
group by id, state, amount;
BTW, date is a reserved keyword in SQL-92, so you may want to rename that column.
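Note that the query above only returns the deduplicated rows; it does not change the table itself. If rebuilding the table is acceptable, one possible way to persist the result is to write the query output into a new table with CREATE TABLE AS. This is only a sketch: deduped_table is a hypothetical name, and a table created this way may not carry over constraints or column defaults from the original, so check that before swapping it in.

```sql
-- Sketch only: deduped_table is a hypothetical name, not from the question.
-- Keeps one row per (id, state, amount) group, using the earliest date.
CREATE TABLE deduped_table AS
SELECT MIN(date) AS date, id, state, amount
FROM some_table_name_you_never_disclosed
GROUP BY id, state, amount;
```

After checking the contents of deduped_table, you could drop the original table and rename the new one to take its place.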
Answer:
You can create a temporary table that stores the duplicate records to be deleted:
CREATE TEMPORARY TABLE temp
AS
-- Collect every row except the earliest-dated one in each (ID, State, Amount) group
SELECT DATE,
       ID
FROM (SELECT DATE,
             ID,
             ROW_NUMBER() OVER (PARTITION BY ID, State, Amount ORDER BY DATE ASC) AS rnk
      FROM your_table_name) AS ranked
WHERE rnk > 1;
Then use a DELETE statement:
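-- Matches on date and id only, so any row sharing both values with a flagged duplicate is deleted too
DELETE
FROM your_table_name USING temp
WHERE your_table_name.date = temp.date
AND your_table_name.id = temp.id;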
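To confirm the delete did what you expected, a quick check along these lines may help. It is a sketch that reuses the placeholder name your_table_name from above; row_count is just an illustrative alias.

```sql
-- Lists any (id, state, amount) group that still has more than one row.
-- An empty result means no duplicates remain on those three columns.
SELECT id, state, amount, COUNT(*) AS row_count
FROM your_table_name
GROUP BY id, state, amount
HAVING COUNT(*) > 1;
```

Running both the DELETE and this check inside one transaction would let you roll back if the result is not what you wanted.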