31

This query: select count(*) from planner_event takes a very long time to run - so long, I gave up and killed it before it finished. However, when I run explain select count(*) from planner_event, I can see a column in the output with the number of rows (14m).

How come explain can get the number of rows instantly, but count(*) takes a long time to run?

Benubird
  • 495
  • 1
  • 5
  • 8

2 Answers2

31

Explain is using previously gathered statistics (used by the query optimizer). Doing a select count(*) reads EVERY data block.

Here's a cheap way to get an estimated row count:

SELECT table_rows
FROM information_schema.tables
WHERE table_name='planner_event';

Even if you did select count(id), it might still take a very long time, unless you have a secondary index on id (also assuming id is a PRIMARY KEY). Because all data (including Row Data) is stored in B-Tree indexes, performing a select count(PK_COLUMN) is still a considerable amount of IO (needs to reads all data pages). If you have a secondary index on the PK field, it will be able to perform less IO to perform a count.

ijoseph
  • 105
  • 4
Kevin Bott
  • 636
  • 5
  • 9
5

Explain gets the number from some "statistics" that are used to estimate things for the Optimizer. That number can be far from correct -- I sometimes see it being more than a factor of 2 (higher or lower) than the exact value.

Performing the COUNT(*) on an InnoDB table must scan the table to avoid miscounting records that are busy being inserted/deleted by other connections but not yet "committed". Actually, it is good enough to do a full scan on some index, not necessarily the whole table (which contains the PRIMARY KEY).

How much RAM do you have? What is the value of innodb_buffer_pool_size? It might help if that were about 70% of RAM.

Rick James
  • 80,479
  • 5
  • 52
  • 119