SQL重复记录查询_删除

SQL重复记录查询

前言:查询重复,首先需要清楚什么是重复的数据,我们在数据库中如何定义记录为重复的!!!

在数据库中,一般是指两个意义上的重复记录,一是完全重复的记录,也即所有字段均重复的记录;二是部分关键字段重复的记录,比如Name字段重复,而其他字段不一定重复或都重复可以忽略。

针对上面的第一种重复情况,比较容易解决,使用

select distinct * from tableName

就可以得到无重复记录的结果集。

针对上面的第二种重复情况,又可以细分为单字段重复则认定为重复和多个字段重复,才认定为重复!!!下面开始详细讲解第二种情况的查询与删除重复数据的操作:

1. 查找表中多余的重复记录,重复记录是根据单个字段(peopleId)来判断的

代码001:

 SELECT 
  * 
FROM
  people 
WHERE peopleId IN 
  (SELECT 
    peopleId 
  FROM
    people 
  GROUP BY peopleId 
  HAVING COUNT(peopleId) > 1)

图片001:
SQL重复记录查询_删除

2. 删除表中多余的重复记录,重复记录是根据单个字段(peopleId)来判断,只留有rowid最小的记录

代码002:

> DELETE  FROM   people  WHERE peopleId IN    (SELECT 
>     peopleId    FROM
>     people    GROUP BY peopleId    HAVING COUNT(peopleId) > 1)    AND rowid NOT IN    (SELECT 
>     MIN(rowid)    FROM
>     people    GROUP BY peopleId    HAVING COUNT(peopleId) > 1)

图片002:

SQL重复记录查询_删除

3. 查找表中多余的重复记录(多个字段)

代码003:

SELECT * FROM vitae a WHERE (a.peopleId, a.seq) IN (SELECT
peopleId,
seq FROM
vitae GROUP BY peopleId,
seq HAVING COUNT(*) > 1)

图片003:
SQL重复记录查询_删除

4. 删除表中多余的重复记录(多个字段),只留有rowid最小的记录

代码004:

DELETE 
FROM
  vitae a 
WHERE (a.peopleId, a.seq) IN 
  (SELECT 
    peopleId,
    seq 
  FROM
    vitae 
  GROUP BY peopleId,
    seq 
  HAVING COUNT(*) > 1) 
  AND rowid NOT IN 
  (SELECT 
    MIN(rowid) 
  FROM
    vitae 
  GROUP BY peopleId,
    seq 
  HAVING COUNT(*) > 1)

图片004:
SQL重复记录查询_删除

5. 查找表中多余的重复记录(多个字段),不包含rowid最小的记录

代码005:

SELECT 
  * 
FROM
  vitae a 
WHERE (a.peopleId, a.seq) IN 
  (SELECT 
    peopleId,
    seq 
  FROM
    vitae 
  GROUP BY peopleId,
    seq 
  HAVING COUNT(*) > 1) 
  AND rowid NOT IN 
  (SELECT 
    MIN(rowid) 
  FROM
    vitae 
  GROUP BY peopleId,
    seq 
  HAVING COUNT(*) > 1)

图片005:
SQL重复记录查询_删除