51

I am having performance issues on certain database queries that have large possible result sets.

The query in question, I have three ANDs in the WHERE clause

Does the order of the clauses matter?

As in, if I put the ASI_EVENT_TIME clause first (since that would remove the most of the results out of any of the clauses.

Will that improve the run time on the query?

QUERY:

SELECT DISTINCT  activity_seismo_info.* 
FROM `activity_seismo_info` 
WHERE 
    activity_seismo_info.ASI_ACTIVITY_ID IS NOT NULL  AND 
    activity_seismo_info.ASI_SEISMO_ID IN (43,44,...,259) AND 
    (
        activity_seismo_info.ASI_EVENT_TIME>='2011-03-10 00:00:00' AND 
        activity_seismo_info.ASI_EVENT_TIME<='2011-03-17 23:59:59'
    ) 

ORDER BY activity_seismo_info.ASI_EVENT_TIME DESC

EXPLAIN of query:

+----+-------------+---------+-------+---------------------------+--------------+---------+------+-------+-----------------------------+ 
| id | select_type | table   | type  | possible_keys             | key          | key_len | ref  | rows  | Extra                       |
+----+-------------+---------+-------+---------------------------+--------------+---------+------+-------+-----------------------------+ 
|  1 | SIMPLE      | act...o | range | act...o_FI_1,act...o_FI_2 | act...o_FI_1 | 5       | NULL | 65412 | Using where; Using filesort |
+----+-------------+---------+-------+---------------------------+--------------+---------+------+-------+-----------------------------+

Using:

PHP 5.2

MySQL 5.0.51a-3ubuntu5.4

Propel 1.3

Symfony 1.2.5

RolandoMySQLDBA
  • 185,223
  • 33
  • 326
  • 536
Patrick
  • 4,329
  • 7
  • 29
  • 28

6 Answers6

31

I do not think so. The query optimizer should be clever enough.

You can try rearranging the WHERE clauses and see that EXPLAINS tells you the same in each case.


About what can be done to optimize this query: Is there an index on ASI_EVENT_TIME ? (this is the most crucial I think for this query as you also sort the results using it).

Are there indexes on the other two fields (ASI_SEISMO_ID and ASI_ACTIVITY_ID)?

It would be helpful if you posted the table structure.

ypercubeᵀᴹ
  • 99,450
  • 13
  • 217
  • 306
18

From the documentation:

If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to find rows. For example, if you have a three-column index on (col1, col2, col3), you have indexed search capabilities on (col1), (col1, col2), and (col1, col2, col3).

MySQL cannot use an index if the columns do not form a leftmost prefix of the index.

So yes, it should be the same as the order of the columns in a compound index.

Gaius
  • 11,238
  • 3
  • 32
  • 64
10

No, it doesn't matter.

The optimizer does a bunch of simple transformations straight after it parses the SQL - this is one of them.

Morgan Tocker
  • 3,810
  • 22
  • 24
8

WHERE foo AND bar

optimizes the same as

WHERE bar AND foo

However,

WHERE non-equal#1 AND non-equal#2

Cannot optimize both parts. For example,

WHERE a BETWEEN 1 and 3 AND b > 17

cannot make good use of INDEX(a,b) or INDEX(b,a)

To phrase it differently, any '=' tests AND'd together in the WHERE clause are used first, then one non-'=' (IN, BETWEEN, >, etc) can be handled. No more than one can be effectively optimized.

Your query has 3 such clauses.

As it turns out, INDEX(EVENT_TIME) is probably the most useful -- it will help with one of the ANDs, and it might be used to avoid "filesort" for the ORDER BY.

If there are no duplicate rows (why the heck would there be?), then get rid of DISTINCT. That causes even more effort.

Please provide SHOW CREATE TABLE and SHOW TABLE STATUS when asking performance questions.

Update... Newer versions (eg, MySQL 5.7) can, in some situations, treat IN( list of constants ) nearly like =. To play it safe, stick with this order (each part being optional):

  1. Any number of =.
  2. Some INs.
  3. At most one range.
Rick James
  • 80,479
  • 5
  • 52
  • 119
1

MySQL where optimization doc says :

You might be tempted to rewrite your queries to make arithmetic operations faster, while sacrificing readability. Because MySQL does similar optimizations automatically, you can often avoid this work, and leave the query in a more understandable and maintainable form. Some of the optimizations performed by MySQL follow:

  • ...

  • For each table in a join, a simpler WHERE is constructed to get a fast WHERE evaluation for the table and also to skip rows as soon as possible.

  • Each table index is queried, and the best index is used unless the optimizer believes that it is more efficient to use a table scan. At one time, a scan was used based on whether the best index spanned more than 30% of the table, but a fixed percentage no longer determines the choice between using an index or a scan. The optimizer now is more complex and bases its estimate on additional factors such as table size, number of rows, and I/O block size.

This way it is rational for the query optimizer to omit HOW-order we used the columns in the query (Not only MySQL but SQL is a declarative language and must do what we want not how we want).

However I still love to have same sort for the columns of a composite key in the query but it is sometimes inevitable for example when we use ORM or ActiveRecord, in some frameworks such as yii2, customizing the relation criteria will be appended to the end of an "on" condition but we still need the capabilities of QueryBuilders in different parts of an application.

-2

ANY field that is used in your WHERE/HAVING clauses and has high selectivity (the number of unique values / the total number of records > 10%~20%) MUST be indexed.

So, if your ASI_EVENT_TIME column has many possible values, first index them all. Then as @ypercube told, try rearranging them and see what EXPLAIN tells you. Should be all around the same.

Additionally, want you to have a look at Indexing SQL LIKE Filters. Though it is not what you need an answer for, but you will still learn about how indexing works under the hood.

*Edit: Refer to the links provided below in the comments to learn more about indexing.

esengineer
  • 535
  • 5
  • 10