How to find duplicate entries in a database table?
The following query displays all Dewey Decimal numbers that are duplicated in the "book" table:
SELECT dewey_number,
COUNT(dewey_number) AS NumOccurrences
FROM book
GROUP BY dewey_number
HAVING (COUNT(dewey_number) > 1)
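However, what I'd like is for my query to also display the last name of the author associated with each duplicated entry (the "book" table and the "author" table are connected by "author_id"). In other words, the query above produces something like the following: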
dewey_number | NumOccurrences
------------------------------
5000 | 2
9090 | 3
What I'd like the results to show is something like this:
author_last_name | dewey_number | NumOccurrences
-------------------------------------------------
Smith | 5000 | 2
Jones | 5000 | 2
Jackson | 9090 | 3
Johnson | 9090 | 3
Jeffers | 9090 | 3
Any help you can provide is greatly appreciated. Also, in case it matters, I'm using a PostgreSQL DB.
UPDATE: Please note that "author_last_name" is not in the "book" table.
A nested query can do the job.
SELECT author_last_name, dewey_number, NumOccurrences
FROM author INNER JOIN
(SELECT author_id, dewey_number, COUNT(dewey_number) AS NumOccurrences
FROM book
GROUP BY author_id, dewey_number
HAVING (COUNT(dewey_number) > 1)) AS duplicates
ON author.id = duplicates.author_id
(I don't know whether this is the fastest way to get what you want.)
Update: here is my data
SELECT * FROM author;
id | author_last_name
----+------------------
1 | Fowler
2 | Knuth
3 | Lang
SELECT * FROM book;
id | author_id | dewey_number | title
----+-----------+--------------+------------------------
1 | 1 | 600 | Refactoring
2 | 1 | 600 | Refactoring
3 | 1 | 600 | Analysis Patterns
4 | 2 | 600 | TAOCP vol. 1
5 | 2 | 600 | TAOCP vol. 1
6 | 2 | 600 | TAOCP vol. 2
7 | 3 | 500 | Algebra
8 | 3 | 500 | Undergraduate Analysis
9 | 1 | 600 | Refactoring
10 | 2 | 500 | Concrete Mathematics
11 | 2 | 500 | Concrete Mathematics
12 | 2 | 500 | Concrete Mathematics
Here is the result of the query above:
author_last_name | dewey_number | numoccurrences
------------------+--------------+----------------
Fowler | 600 | 4
Knuth | 600 | 3
Knuth | 500 | 3
Lang | 500 | 2
OK, a nested query may be overkill, but now I'm sure it works. Tony Andrews' comment about the author_id column applies here as well. – 2009-01-12 17:07:39
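Note that the nested query above groups by author_id as well, so its NumOccurrences is a per-author count, whereas the output requested in the question repeats the total for each Dewey number across all authors. If that total is what's wanted, one possible variation is sketched below. It is only a sketch under stated assumptions: it runs the SQL through Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for PostgreSQL, it reuses the sample data shown above, and the aliases `a`, `b` and `dup` are made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, author_last_name TEXT);
CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER,
                   dewey_number INTEGER, title TEXT);
INSERT INTO author VALUES (1, 'Fowler'), (2, 'Knuth'), (3, 'Lang');
INSERT INTO book VALUES
  (1, 1, 600, 'Refactoring'), (2, 1, 600, 'Refactoring'),
  (3, 1, 600, 'Analysis Patterns'), (4, 2, 600, 'TAOCP vol. 1'),
  (5, 2, 600, 'TAOCP vol. 1'), (6, 2, 600, 'TAOCP vol. 2'),
  (7, 3, 500, 'Algebra'), (8, 3, 500, 'Undergraduate Analysis'),
  (9, 1, 600, 'Refactoring'), (10, 2, 500, 'Concrete Mathematics'),
  (11, 2, 500, 'Concrete Mathematics'), (12, 2, 500, 'Concrete Mathematics');
""")

# Count duplicates per dewey_number only, then attach every author who has
# at least one book carrying a duplicated number.
rows = conn.execute("""
SELECT DISTINCT a.author_last_name, dup.dewey_number, dup.NumOccurrences
FROM book AS b
JOIN author AS a ON a.id = b.author_id
JOIN (SELECT dewey_number, COUNT(dewey_number) AS NumOccurrences
      FROM book
      GROUP BY dewey_number
      HAVING COUNT(dewey_number) > 1) AS dup
  ON dup.dewey_number = b.dewey_number
ORDER BY dup.dewey_number, a.author_last_name
""").fetchall()

for row in rows:
    print(row)
# ('Knuth', 500, 5)
# ('Lang', 500, 5)
# ('Fowler', 600, 7)
# ('Knuth', 600, 7)
```

DISTINCT is needed because an author with several books under the same number would otherwise appear once per book. The SELECT statement itself uses only standard joins and grouping, so it should carry over to PostgreSQL, though I have only checked it against SQLite.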
You probably want this:
SELECT dewey_number, author_last_name,
COUNT(dewey_number) AS NumOccurrences
FROM book
GROUP BY dewey_number,author_last_name
HAVING (COUNT(dewey_number) > 1)
"author_last_name" is not in the "book" table. The "book" and "author" tables are linked through the foreign key "author_id" in the "book" table. – Huuuze 2009-01-12 16:27:12
SELECT dewey_number, author_last_name,
COUNT(dewey_number) AS NumOccurrences
FROM book
JOIN author USING (author_id)
GROUP BY dewey_number,author_last_name
HAVING COUNT(dewey_number) > 1
If book.author_id can be null, then change the join to:
LEFT OUTER JOIN author USING (author_id)
If the author_id column has a different name in each table, then you can't use USING; use ON instead:
JOIN author ON author.id = book.author_id
or
LEFT OUTER JOIN author ON author.id = book.author_id
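To make the difference between the two join types concrete, here is a small sketch. It assumes an in-memory SQLite database (via Python's sqlite3 module) rather than PostgreSQL, an author table keyed by `id` as in the ON variant above, and a handful of made-up rows in which two books have no author.

```python
import sqlite3

# Assumption: in-memory SQLite with hypothetical sample rows, not the asker's data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, author_last_name TEXT);
CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, dewey_number INTEGER);
INSERT INTO author VALUES (1, 'Fowler');
INSERT INTO book VALUES (1, 1, 600), (2, 1, 600), (3, NULL, 700), (4, NULL, 700);
""")

# Same duplicate query twice; only the join keyword changes.
QUERY = """
SELECT b.dewey_number, a.author_last_name,
       COUNT(b.dewey_number) AS NumOccurrences
FROM book AS b
{join} author AS a ON a.id = b.author_id
GROUP BY b.dewey_number, a.author_last_name
HAVING COUNT(b.dewey_number) > 1
ORDER BY b.dewey_number
"""

inner_rows = conn.execute(QUERY.format(join="JOIN")).fetchall()
left_rows = conn.execute(QUERY.format(join="LEFT OUTER JOIN")).fetchall()

print(inner_rows)  # [(600, 'Fowler', 2)]
print(left_rows)   # [(600, 'Fowler', 2), (700, None, 2)]
```

With the inner join, the duplicated number 700 disappears because its books have no matching author row. The left outer join keeps it and reports the author as NULL (None in Python).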
select author_name,dewey_number,Num_of_occur
from author a,(select author_id,dewey_number,count(dewey_number) Num_of_occur
from book
group by author_id,dewey_number
having count(dewey_number) > 1) dup
where a.author_id = dup.author_id
The simplest and most effective way I've found is shown below:
SELECT
p.id
, p.full_name
, (SELECT count(id) FROM tbl_documents as t where t.person_id = p.id) as rows
FROM tbl_people as p
WHERE
p.id
IN (SELECT d.person_id FROM tbl_documents as d
GROUP BY d.person_id HAVING count(d.id) > 1)
ORDER BY
p.full_name
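For readers who want to try this pattern, here is a minimal sketch that runs it next to a join-based rewrite and checks that both return the same rows. The assumptions: an in-memory SQLite database via Python's sqlite3 module instead of PostgreSQL, made-up sample rows, and the column alias `doc_count` in place of `rows`. The table and column names are the ones used in the query above.

```python
import sqlite3

# Assumption: in-memory SQLite with hypothetical people and documents.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_people (id INTEGER PRIMARY KEY, full_name TEXT);
CREATE TABLE tbl_documents (id INTEGER PRIMARY KEY, person_id INTEGER);
INSERT INTO tbl_people VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
INSERT INTO tbl_documents VALUES (1, 1), (2, 1), (3, 2), (4, 3), (5, 3), (6, 3);
""")

# Pattern from the answer: IN filter plus a correlated count per person.
subquery_rows = conn.execute("""
SELECT p.id, p.full_name,
       (SELECT COUNT(t.id) FROM tbl_documents AS t
        WHERE t.person_id = p.id) AS doc_count
FROM tbl_people AS p
WHERE p.id IN (SELECT d.person_id FROM tbl_documents AS d
               GROUP BY d.person_id HAVING COUNT(d.id) > 1)
ORDER BY p.full_name
""").fetchall()

# Alternative: group once in a derived table and join to it.
join_rows = conn.execute("""
SELECT p.id, p.full_name, c.doc_count
FROM tbl_people AS p
JOIN (SELECT person_id, COUNT(id) AS doc_count
      FROM tbl_documents
      GROUP BY person_id
      HAVING COUNT(id) > 1) AS c ON c.person_id = p.id
ORDER BY p.full_name
""").fetchall()

print(subquery_rows)  # [(1, 'Alice', 2), (3, 'Carol', 3)]
print(join_rows)      # [(1, 'Alice', 2), (3, 'Carol', 3)]
```

The join form states the grouping only once. Whether it is actually faster depends on the database and its planner, so it is worth measuring rather than assuming.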
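Duplicate of http://*.com/questions/18932/sql-how-can-i-remove-duplicate-rows. Can someone close this? – 2009-01-12 16:28:29
Please read carefully. This is not a duplicate. – Huuuze 2009-01-12 16:41:59
Not a duplicate at all. He wants to *find* the duplicate rows (and more), not delete them. – 2009-01-12 16:45:03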