SQL重复记录查询_删除
SQL重复记录查询
前言:查询重复,首先需要清楚什么是重复的数据,我们在数据库中如何定义记录为重复的!!!
在数据库中,一般是指两个意义上的重复记录,一是完全重复的记录,也即所有字段均重复的记录;二是部分关键字段重复的记录,比如Name字段重复,而其他字段不一定重复或都重复可以忽略。
针对上面的第一种重复情况,比较容易解决,使用
select distinct * from tableName
就可以得到无重复记录的结果集。
针对上面的第二种重复情况,又可以细分为单字段重复则认定为重复和多个字段重复,才认定为重复!!!下面开始详细讲解第二种情况的查询与删除重复数据的操作:
1. 查找表中多余的重复记录,重复记录是根据单个字段(peopleId)来判断的
代码001:
SELECT * FROM people WHERE peopleId IN (SELECT peopleId FROM people GROUP BY peopleId HAVING COUNT(peopleId) > 1)
图片001:
2. 删除表中多余的重复记录,重复记录是根据单个字段(peopleId)来判断,只留有rowid最小的记录
代码002:
> DELETE FROM people WHERE peopleId IN (SELECT
> peopleId FROM
> people GROUP BY peopleId HAVING COUNT(peopleId) > 1) AND rowid NOT IN (SELECT
> MIN(rowid) FROM
> people GROUP BY peopleId HAVING COUNT(peopleId) > 1)
图片002:
3. 查找表中多余的重复记录(多个字段)
代码003:
SELECT * FROM vitae a WHERE (a.peopleId, a.seq) IN (SELECT
peopleId,
seq FROM
vitae GROUP BY peopleId,
seq HAVING COUNT(*) > 1)
图片003:
4. 删除表中多余的重复记录(多个字段),只留有rowid最小的记录
代码004:
DELETE FROM vitae a WHERE (a.peopleId, a.seq) IN (SELECT peopleId, seq FROM vitae GROUP BY peopleId, seq HAVING COUNT(*) > 1) AND rowid NOT IN (SELECT MIN(rowid) FROM vitae GROUP BY peopleId, seq HAVING COUNT(*) > 1)
图片004:
5. 查找表中多余的重复记录(多个字段),不包含rowid最小的记录
代码005:
SELECT * FROM vitae a WHERE (a.peopleId, a.seq) IN (SELECT peopleId, seq FROM vitae GROUP BY peopleId, seq HAVING COUNT(*) > 1) AND rowid NOT IN (SELECT MIN(rowid) FROM vitae GROUP BY peopleId, seq HAVING COUNT(*) > 1)
图片005: