0

When we show a table with many thousands of records, we only show a small part of the total result set so as not to send too much data to the client at once. Clients can request more parts of the result set through pagination.

This is how a typical query for such a table looks:

SELECT `expression1`, `expression2` FROM table (JOINS) ORDER BY table.id LIMIT 0,100

The LIMIT clause restricts the result set so that the table does not become too big. The ORDER BY clause enforces a deterministic order.

I have noticed that these queries can become very slow when several database tables are joined and at least one of them contains a large number of rows. Leaving out the ORDER BY clause greatly improves the speed, up to 10000%. Apart from not having to sort, I assume the query optimizes by executing the LIMIT clause before the SELECT clause.

Ironically, it is not necessary that the result set is ordered by id. We just need a deterministic order to facilitate reliable pagination, for example one that follows the order of the primary key or a predetermined index.

Is there an instruction to achieve deterministic order without the ORDER BY clause?

user2180613
  • 1,792

1 Answers1

8

No

You cannot and should not rely on an SQL database returning rows in any deterministic fashion without proper ORDER BY clauses. In practice, rows may come back in a particular order each time you run the same query. Simply because its looking at the same data in the same order on disk. But there's no guarantee of it. Various things going on on the database server can change that physical order.

What you need to do is make sure you have the proper indexes set up. Use EXPLAIN and other tools to minimize row scans.

GrandmasterB
  • 39,412