45

I have a table that has a column called article_title. Let's say the table name is articles. I need to find out the records where the article_title data is the same on more than one record.

Here's what I've got:

select a.* 
from articles a 
where a.article_title = (select article_title 
                         from articles 
                         where article_title = a.article_title 
                         AND a.id <> articles.id)
Michael Green
  • 25,255
  • 13
  • 54
  • 100
somejkuser
  • 797
  • 3
  • 8
  • 15

3 Answers3

53

HAVING is a great aggregate filter. (http://dev.mysql.com/doc/refman/5.6/en/group-by-extensions.html) For example, select the article_titles with more than on occurrence:

SELECT count(*), article_title
FROM articles
GROUP BY article_title
HAVING COUNT(*) > 1;

Adding columns to the SELECT and GROUP BY clauses allow you to locate duplicates based on a composite key of multiple columns.

sqlreader
  • 756
  • 4
  • 6
19

Your problem can be solved with this query:

SELECT *
FROM article
WHERE article_title IN (SELECT *
                        FROM (SELECT article_title
                              FROM article
                              GROUP BY article_title
                              HAVING COUNT(article_title) > 1)
                        AS a);
Sicco
  • 103
  • 2
Abhik Dey
  • 289
  • 2
  • 6
4

have a table that has a column called article_title. Let's say the table name is articles. I need to find out the records where the article_title data is the same on more than one record.

Sound like to me you also need to have the id because you want to find records based on article_title because you have duplicates

Basic MIN/MAX with GROUP BY (you will miss id's when more then 2 duplicates)

SELECT 
  MIN(id) -- for FIFO id's (first id by duplicate)
, MAX(id) -- for LIFO id's (last id by duplicate)
, article_title
, COUNT(*)  
FROM
 articles
WHERE -- Maybe to filter out '' or IS NOT NULL
 article_title != '' AND article_title IS NOT NULL
GROUP BY
 article_title ASC
HAVING
 COUNT(*) >= 2
;

Or back to denormalisation to generate an CSV for LIFO id's (older id's by duplicates) but you know all id's here..

SELECT 
  GROUP_CONCAT(id ORDER BY ASC SEPARATOR ',') -- change to DESC if want the last record first
, article_title
, COUNT(*)  
FROM
 articles
GROUP BY
 article_title ASC
HAVING
 COUNT(*) >= 2
;
Raymond Nijland
  • 1,088
  • 7
  • 16