Finding Missing Values In A MySQL Table

If you have a table of incremental values it can be hard to find out which ones are missing. The only solution might be to write a script to get all the data from the database and see which ones are missing. However, there is a way of doing this without using a script.

Using a standard select query like this:

SELECT * FROM table;

Gets the following data:

1
3
10
23

We can see that values are missing, but which ones? The following query will show us where the gaps are in the data of the table.

SELECT t1.id+1 as Missing
FROM table as t1
LEFT JOIN table  as t2 ON t1.id+1 = t2.id
WHERE t2.id IS NULL
ORDER BY t1.id;

Produces the following result.

2
4
11
24

However, this only tell us where the gaps are, not how long they are. To get the range of where the gaps from and to we need to do something a little more complex.

SELECT
t1.id+1 AS 'Missing From',
MIN(t2.id) - 1 AS 'To'
FROM table AS t1, table AS t2
WHERE t1.id < t2.id
GROUP BY t1.id
HAVING t1.id < MIN(t2.id) - 1;

This query gives the following result.

Missing FromTo
22
49
1122

Using this dataset we can figure out where the gaps in the data are and perhaps do something with them.

Comments

Nice Work! My MySql keeps pulling up an error on the last SQL statement. There error is: #1054 - Unknown column 't1.id' in 'having clause' Versions are: phpMyAdmin - 2.11.5.2 MySQL client version: 5.0.45 Any ideas?
Permalink
Odd this... It works in MySQL query browser, so I thought it must be an issue with phpMyAdmin. And it was! For some reason it has a problem with the table.column reference in the having command. I found the solution was to add quotes around the name, like this:
HAVING 't1.ID'  MIN(t2.ID) - 1
} Let us know if that works!
Name
Philip Norton
Permalink
Great example! Thanks!
Permalink

Very nice. I tweaked your example a little to find records that are missing by date

CREATE  TABLE IF NOT EXISTS `sequence` (
 `id` BIGINT NOT NULL AUTO_INCREMENT ,
 `date` DATETIME NOT NULL ,
 PRIMARY KEY (`id`)
)
ENGINE = InnoDB;



# missing 10/2, 10/12, and 10/13

INSERT INTO sequence (date) values ('2011-10-01 00:00:00');
INSERT INTO sequence (date) values ('2011-10-03 00:00:00');
INSERT INTO sequence (date) values ('2011-10-04 00:00:00');
INSERT INTO sequence (date) values ('2011-10-05 00:00:00');
INSERT INTO sequence (date) values ('2011-10-06 00:00:00');
INSERT INTO sequence (date) values ('2011-10-07 00:00:00');
INSERT INTO sequence (date) values ('2011-10-08 00:00:00');
INSERT INTO sequence (date) values ('2011-10-09 00:00:00');
INSERT INTO sequence (date) values ('2011-10-10 00:00:00');
INSERT INTO sequence (date) values ('2011-10-11 00:00:00');
INSERT INTO sequence (date) values ('2011-10-14 00:00:00');
INSERT INTO sequence (date) values ('2011-10-15 00:00:00');
INSERT INTO sequence (date) values ('2011-10-16 00:00:00');
INSERT INTO sequence (date) values ('2011-10-17 00:00:00');
INSERT INTO sequence (date) values ('2011-10-18 00:00:00');
INSERT INTO sequence (date) values ('2011-10-19 00:00:00');
INSERT INTO sequence (date) values ('2011-10-20 00:00:00');


mysql&gt; SELECT
   -> DATE_ADD(t1.date, INTERVAL 1 DAY) AS 'Missing From',
   -> DATE_ADD( MIN(t2.date), INTERVAL -1 DAY) AS 'To'
   -> FROM sequence AS t1, sequence AS t2
   -> WHERE t1.date  GROUP BY t1.date
   -> HAVING t1.date < DATE_ADD( MIN(t2.date), INTERVAL -1 DAY);
+---------------------+---------------------+
| Missing From        | To                  |
+---------------------+---------------------+
| 2011-10-02 00:00:00 | 2011-10-02 00:00:00 |
| 2011-10-12 00:00:00 | 2011-10-13 00:00:00 |
+---------------------+---------------------+
2 rows in set (0.00 sec)

Permalink

That's awesome! I'm sure I'll have a use for that one day. 

Thanks for your input Andy :)

Name
Philip Norton
Permalink

Add new comment

The content of this field is kept private and will not be shown publicly.