21

I'm a newbie DBA, and I have experience in Microsoft SQL Server but I want to jump to FLOSS.

I'm starting a company, and we develop an app (PHP) with a Postgres backend, and we did some tests comparing with MySQL too. We observe that MySQL is twice as fast as PostgreSQL.

I did a tangible performance test:

  • Same columns in table with equivalent column datatypes.
  • Same number of rows.
  • Same indexes in both (primary key included).
  • The CPU load are idle and Postgres machine it's significantly better.
  • And the same query (obviously).

What am I doing wrong?

P.S: I read many "howtos" on performance tuning for database engines.
P.S(2): We're using InnoDB (one file per table) on the MySQL database.


Hi Mat!

I did the three common select (and hardest) queries.

The question about disk, certainly it's not the same; In Postgres it's a SSD (almost three time fastest).

MySQL cache data:

+------------------------------+----------------------+
| Variable_name                | Value                |
+------------------------------+----------------------+
| binlog_cache_size            | 32768                |
| have_query_cache             | YES                  |
| key_cache_age_threshold      | 300                  |
| key_cache_block_size         | 1024                 |
| key_cache_division_limit     | 100                  |
| max_binlog_cache_size        | 18446744073709547520 |
| query_cache_limit            | 1048576              |
| query_cache_min_res_unit     | 4096                 |
| query_cache_size             | 16777216             |
| query_cache_type             | ON                   |
| query_cache_wlock_invalidate | OFF                  |
| table_definition_cache       | 256                  |
| table_open_cache             | 64                   |
| thread_cache_size            | 8                    |
+------------------------------+----------------------+

I don't know how to view this in PostgreSQL.

Thanks in advance.

RolandoMySQLDBA
  • 185,223
  • 33
  • 326
  • 536
Javier Valencia
  • 313
  • 1
  • 3
  • 7

1 Answers1

44

MySQL and PostgreSQL are quite difference performance-wise. InnoDB and PostgreSQL tables are optimized for different sorts of queries. Understanding these differences is important to understanding how to get good performance out of either.

As an example, let's look at the most obvious difference.

PostgreSQL vs MySQL/InnoDB Table Structure and What This Means for Performance

In general, on complex work-loads, PostgreSQL will be faster, but on simple primary key lookups MySQL with InnoDB will be faster.

PostgreSQL tables are heap tables. There is no option to build a table which is not a heap table. The cluster command simply rewrites the heap ordered by a specified index. Indexes then provide heap locations for tuples with various values. Indexes cannot be traversed in physical order, only logical order so they have a lot of random disk I/O while reading a table sequentially usually means a lot of sequential disk I/O, since you can read a table in physical order. Sequential disk I/O gets to use read-ahead cache and some other OS-level optimization.

What this means is that if you need a significant portion of records or over a few pages, it is usually faster to just read the pages from disk. On the other hand, a primary key lookup for a table requires hitting the index, lookup up the location in the file then hit the heap table and pull the record. This means a number of pieces of random disk I/O.

InnoDB uses a different approach. With InnoDB, the table is a b-tree index with the actual data in the index payload. This means that a primary key lookup already gets to pull the data from the leaf page, and so less random disk I/O is required for this. At the same time, an index scan requires traversing two indexes instead of one, meaning use of any index other than the primary key ends up being slower and sequential scans are slower still.

Getting Diagnoses in PostgreSQL

I think you want to use something like:

 EXPLAIN (analyse, buffers, verbose)
 [query];

That will give you the query plan, initial estimates, actual times, buffer usage, and much more.

Chris Travers
  • 13,112
  • 51
  • 95