
Been sitting on this for 12 hours straight (it's noon now, so I will read/respond when I wake up again).
I made the big mistake of recommending a database upgrade of our production environment to improve performance.
We can NOT go back: it's nearly 6 terabytes of storage, and downgrades are not possible with MySQL 8.0.
Restoring a pre-upgrade snapshot backup is not a solution either (days of work).

I would like to solve the horrible performance that resulted from switching the server to the latest version.

A few details:
Environment: Debian Stretch on an AWS i3.8xlarge (32 CPUs, 240 GB RAM)
Server: MySQL 8.0 in a mysql Docker container, using bind mounts and host networking
Storage: Amazon AWS EBS io1 storage (6 TB) with 20,000 IOPS reserved.
The storage delivers 500-600 MB/s in fio tests.
CPU: Usually the 32 cores are 50-60% utilized, but since MySQL 8.0 they sit mostly idle (10-15% used)
RAM: 200 GB are dedicated to MySQL; before 8.0 the buffer pool did not take long to fill up. Now it takes many hours until MySQL is able to fill it.
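(For reference, the fill level can be watched via the standard status counters; with the default 16 KB page size, Innodb_buffer_pool_pages_data * 16 KB is roughly the amount of data cached:)

SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_data';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_total';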

The core problem: throughput dropped by a factor of about 20 compared to 5.7.
InnoDB/MySQL is not using the disk efficiently.
With the original configuration it was reading at 15 MB/s (simple counts took minutes to run because the .ibd files were not being read properly).
I have since disabled the performance schema, which at least helped get the server operating again at 10% load.
I disabled binary logging; maybe that helped a bit more, not sure.
I tried increasing the read/write threads to 64 (that completely stalled the server).

I invested a good 8 hours in trying to tune the MySQL configuration and managed to "push" MySQL from using only 5 MB/s to now up to 50 MB/s.

To make sure:
Of course I did test disk IO from inside Docker; it is practically identical to outside.
The main disk is mostly idle, the other disks completely idle.
The system is 90% idle when it should be busy.

InnoDB status:
https://pastebin.com/XDgyNbk0

MySQL configuration:

[mysqld]
user=mysql
pid-file        = /var/run/mysqld/mysqld.pid
socket          = /var/run/mysqld/mysqld.sock
datadir         = /var/lib/mysql

tmpdir          = /mysql_tmp  

# compatibility
default_authentication_plugin=mysql_native_password
character_set_server=latin1
collation_server=latin1_swedish_ci



log-error       = /var/log/mysql_error.log
bind-address    = 0.0.0.0
sql_mode= "NO_ENGINE_SUBSTITUTION"

interactive_timeout     = 3600
wait_timeout            = 900

max_allowed_packet      = 64M
thread_stack            = 256K
thread_cache_size       = 192

max_connections         = 1600
max_user_connections    = 1500

#query_cache_limit       = 3M
#query_cache_size        = 200M
#query_cache_type       = 1
table_open_cache        = 2500


key_buffer_size          = 64M   # index in memory for myisam
innodb_buffer_pool_size = 190G
innodb_log_file_size = 256M
tmp_table_size = 250M
max_heap_table_size = 250M
join_buffer_size = 2M
# page cleaners - these were uncommented on 5.7
#innodb_buffer_pool_instances=8
#innodb_page_cleaners=2
innodb_io_capacity=5000
innodb_io_capacity_max=20000

# tried 64, that totally stalled the database
innodb_read_io_threads = 8
innodb_write_io_threads = 8

#in pre 5.7 times I had consistent 300mb/sec writes, now it's useless
innodb_lru_scan_depth=256

skip-name-resolve

secure_file_priv=""
#innodb_checksum_algorithm = crc32
#binlog_checksum = CRC32

# this one at least made it possible for me to go to bed; with performance_schema the database was unusable
performance_schema=OFF
skip-log-bin 

Sysctl fs:

fs.aio-max-nr = 1048576
fs.aio-nr = 139264
fs.binfmt_misc.status = enabled
fs.dentry-state = 355223        335269  45      0       0       0
fs.dir-notify-enable = 1
fs.epoll.max_user_watches = 51660308
fs.file-max = 25224638
fs.file-nr = 19136      0       25224638
fs.inode-nr = 70145     5686
fs.inode-state = 70145  5686    0       0       0       0       0
fs.inotify.max_queued_events = 16384
fs.inotify.max_user_instances = 128
fs.inotify.max_user_watches = 8192
fs.lease-break-time = 45
fs.leases-enable = 1
fs.mqueue.msg_default = 10
fs.mqueue.msg_max = 10
fs.mqueue.msgsize_default = 8192
fs.mqueue.msgsize_max = 8192
fs.mqueue.queues_max = 256
fs.nr_open = 1048576
fs.overflowgid = 65534
fs.overflowuid = 65534
fs.pipe-max-size = 1048576
fs.protected_hardlinks = 1
fs.protected_symlinks = 1
fs.quota.allocated_dquots = 0
fs.quota.cache_hits = 0
fs.quota.drops = 0
fs.quota.free_dquots = 0
fs.quota.lookups = 0
fs.quota.reads = 0
fs.quota.syncs = 62
fs.quota.warnings = 1
fs.quota.writes = 0
fs.suid_dumpable = 0

iostat snapshot:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          13.86    0.71   10.84    3.23    0.11   71.26

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
xvdh           1795.31     13923.89     26159.95  256609017  482112504

As you can see, it's only reading at 14 MB/s and writing at 26 MB/s.
With MySQL 5.7 it was doing up to 200 MB/s reading and writing.

The disk is basically idle. It can deliver 10 times the performance, but for some unknown reason InnoDB/MySQL no longer drives it that hard.

Update: Variables and global status: https://pastebin.com/pzhXV7hq
https://pastebin.com/jBTYLbY6

Another change I was forced to make:
I have a hundred users connecting per second through Apache/PHP, and every connection used to trigger a "SELECT count(*) FROM information_schema.processlist". The new MySQL couldn't handle that anymore; it queued up 50+ of those SELECTs, so I instead made an asynchronous task that copies the processlist into an InnoDB table every 5 seconds (see the sketch below).
It's just another sign of how much less responsive the new MySQL is; it even chokes on processlist queries.
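A minimal sketch of the kind of task I mean (the table and event names here are just placeholders; an external cron job works equally well):

CREATE TABLE IF NOT EXISTS processlist_snapshot (
    snapshot_time DATETIME NOT NULL,
    thread_count  INT      NOT NULL
) ENGINE=InnoDB;

CREATE EVENT IF NOT EXISTS copy_processlist
    ON SCHEDULE EVERY 5 SECOND
    DO INSERT INTO processlist_snapshot
       SELECT NOW(), COUNT(*) FROM information_schema.processlist;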

ulimit

   core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 985342
    max locked memory       (kbytes, -l) 64
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 65536
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) 8192
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 985342
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited

Big update
I spent half a day reading into the internals, taking your suggestions into consideration along with what I was told on IRC.
I did the opposite of what a professional would do: I made 10 changes at once, because I can't afford that many restarts in a gradual change process:
1) Giving the system plenty of parallel I/O capacity without choking it

innodb_read_io_threads = 16  
innodb_write_io_threads = 16  
innodb_thread_concurrency=64  # cpus*2  

2) Speeding up the background sync:

 innodb_lru_scan_depth=100  

3) Disabling the highest-reliability settings that have a hard performance impact

performance_schema=OFF
skip-log-bin 
sync_binlog=0 
innodb_flush_log_at_trx_commit=0  # not crash safe; 1 is the crash-safe setting

4) More parallelism in the buffer pool back end

innodb_buffer_pool_instances=12  

5) Increasing the log files significantly, increasing the log buffer moderately

innodb_log_file_size = 3G
innodb_log_buffer_size = 64M
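Side note: of the settings above, only a few are dynamic; the thread counts, buffer pool instances, log file size, performance_schema and skip-log-bin all require the restart anyway. The dynamic ones could be applied like this:

SET GLOBAL innodb_thread_concurrency = 64;
SET GLOBAL innodb_lru_scan_depth = 100;
SET GLOBAL sync_binlog = 0;
SET GLOBAL innodb_flush_log_at_trx_commit = 0;   -- trades crash safety for speed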

What happened: about a 10x increase in read performance and 1.5x in write performance. I am not where I want to be, but it's ~15 times faster than before!
IOPS usage doubled from ~5-6k to 9k-12k, so I am at about 60% IO usage.
CPU usage increased from 7% to 50%.

My goal would be 80% IO and CPU utilization by the database; I think other variables are still bottlenecking.

In real-world use: I had a huge insert running before and after the change (alongside the usual load).
Before the change it ran at about 3,000 rows per second; after the above changes it's 8,000 rows per second.

Thought I'd share this, as the change in performance is extreme and I've only reached 50% of what should be possible.

Update
I think the problem can be considered half solved. I've made another round of changes after the previous successful one, and the performance is acceptable now.

The last changes involved the read/write threads; I've set them to 32 each.
The write buffer was increased to 128M (for my heavy workload, higher might be even better),
the log files were increased to 8 GB,
buffer_pool_instances was increased to 64 (the maximum) to reduce buffer pool contention,
and page_cleaners was increased to 64 (the maximum) so there is one per buffer pool instance (summarized in my.cnf form below).
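In my.cnf terms this round amounts roughly to the following (taking "write buffer" to mean innodb_log_buffer_size; the full current configuration is linked below):

innodb_read_io_threads = 32
innodb_write_io_threads = 32
innodb_log_file_size = 8G
innodb_log_buffer_size = 128M
innodb_buffer_pool_instances = 64
innodb_page_cleaners = 64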

Write performance increased by another ~20%, read performance by another ~30%.

It's been a 24-hour ride to get MySQL to perform acceptably; definitely not a simple upgrade.

Latest status:

Current configuration: https://pastebin.com/9vsbEQxt
SHOW ENGINE INNODB STATUS: https://pastebin.com/kCjnmtze
SHOW GLOBAL VARIABLES: https://pastebin.com/aMdQxWcA
GLOBAL STATUS: https://pastebin.com/VbG1yzHX
John

3 Answers


Review of GLOBAL STATUS and VARIABLES

Observations:

  • Version: 8.0.13
  • 240 GB of RAM
  • Uptime = 10:47:09; some GLOBAL STATUS values may not be meaningful yet.
  • You are not running on Windows.
  • Running 64-bit version
  • You appear to be running entirely (or mostly) InnoDB.

The More Important Issues:

Even though table_open_cache is high, it may be good to raise it further. Raising table_open_cache_instances to 32 may help, too.
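For example, table_open_cache can be raised at runtime, while table_open_cache_instances needs a restart (values here are only illustrative):

SET GLOBAL table_open_cache = 30000;   -- illustrative value; dynamic
-- table_open_cache_instances = 32 goes in my.cnf (not dynamic; restart required)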

innodb_buffer_pool_instances = 16; ditto for innodb_page_cleaners.

Increase innodb_log_file_size to 8G; see the 8.0 manual for details on how to achieve this (things have changed).
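A rough sketch of the procedure on 8.0.x prior to 8.0.30 (double-check the manual for your exact minor version):

# 1) shut the server down cleanly
# 2) in my.cnf:
innodb_log_file_size = 8G
# 3) restart; InnoDB detects the size change and recreates the redo log files itself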

Why do you have innodb_flush_method = fsync? O_DIRECT is recommended in many situations.
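If you try it, it is a my.cnf change (innodb_flush_method is not dynamic, so a restart is needed):

# my.cnf
innodb_flush_method = O_DIRECT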

Shrink long_query_time; check the slow log; there are some naughty queries we need to investigate.
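Both of these can be changed at runtime while investigating, for example:

SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 2;   -- seconds; currently 10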

CREATE TABLE every 5 seconds? What is going on? It is unusual to do more than a few such per day.

REPLACE is mostly supplanted by IODKU (INSERT ... ON DUPLICATE KEY UPDATE); what is your use case?
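A hypothetical example of the difference, assuming a table t with PRIMARY KEY (id) and a column val:

REPLACE INTO t (id, val) VALUES (1, 'x');      -- deletes the existing row, then inserts a new one
INSERT INTO t (id, val) VALUES (1, 'x')        -- updates the existing row in place
    ON DUPLICATE KEY UPDATE val = VALUES(val);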

You are using various SHOW commands a lot. Their implementation has changed drastically in 8.0 and I would expect them to be faster in 8.0, but this could be one source of a slowdown.

10 TRUNCATE TABLE per minute? What's up? If you are replacing the contents of a table, it is usually better to create a new table, populate it, then use RENAME TABLE to atomically swap it into place.
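A sketch of that pattern, with hypothetical table names:

CREATE TABLE t_new LIKE t;
-- ... repopulate t_new ...
RENAME TABLE t TO t_old, t_new TO t;   -- atomic swap
DROP TABLE t_old;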

Details and other observations:

( Innodb_buffer_pool_pages_flushed ) = 96,254,291 / 38829 = 2478 /sec -- Writes (flushes)

( Opened_tables ) = 104,524 / 38829 = 2.7 /sec -- Frequency of opening Tables -- increase table_open_cache

( table_open_cache ) = 19,000 -- Number of table descriptors to cache -- Several hundred is usually good.

( Table_open_cache_misses ) = 104,520 / 38829 = 2.7 /sec -- May need to increase table_open_cache

( innodb_buffer_pool_size / innodb_buffer_pool_instances ) = 181248M / 8 = 22656MB -- Size of each buffer_pool instance. -- An instance should be at least 1GB. In very large RAM, have 16 instances.

( innodb_page_cleaners / innodb_buffer_pool_instances ) = 4 / 8 = 0.5 -- page_cleaners -- Recommend setting innodb_page_cleaners to innodb_buffer_pool_instances

( (Innodb_buffer_pool_reads + Innodb_buffer_pool_pages_flushed) ) = ((23804167 + 96254291) ) / 38829 = 3091 /sec -- InnoDB I/O

( Innodb_os_log_written ) = 249,997,739,520 / 38829 = 6438428 /sec -- This is an indicator of how busy InnoDB is.

( Innodb_log_writes ) = 411,565,494 / 38829 = 10599 /sec

( Innodb_os_log_written / (Uptime / 3600) / innodb_log_files_in_group / innodb_log_file_size ) = 249,997,739,520 / (38829 / 3600) / 2 / 256M = 43.2 -- Ratio -- (see minutes)

( Uptime / 60 * innodb_log_file_size / Innodb_os_log_written ) = 38,829 / 60 * 256M / 249997739520 = 0.695 -- Minutes between InnoDB log rotations. Beginning with 5.6.8, this can be changed with just a config edit and a clean restart; be sure to also change my.cnf. -- (The recommendation of 60 minutes between rotations is somewhat arbitrary.) Adjust innodb_log_file_size. (Cannot change in AWS.)

( innodb_flush_method ) = innodb_flush_method = fsync -- How InnoDB should ask the OS to write blocks. Suggest O_DIRECT or O_ALL_DIRECT (Percona) to avoid double buffering. (At least for Unix.) See chrischandler for caveat about O_ALL_DIRECT

( Innodb_row_lock_waits ) = 41,325 / 38829 = 1.1 /sec -- How often there is a delay in getting a row lock. -- May be caused by complex queries that could be optimized.

( Innodb_dblwr_writes ) = 1,177,834 / 38829 = 30 /sec -- "Doublewrite buffer" writes to disk. "Doublewrites" are a reliability feature. Some newer versions / configurations don't need them. -- (Symptom of other issues)

( Innodb_row_lock_current_waits ) = 54 -- The number of row locks currently being waited for by operations on InnoDB tables. Zero is pretty normal. -- Something big is going on?

( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF -- Whether to log all Deadlocks. -- If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.

( Queries ) = 146,600,019 / 38829 = 3775 /sec -- Queries (including inside SP)

( Created_tmp_tables ) = 43,466,450 / 38829 = 1119 /sec -- Frequency of creating "temp" tables as part of complex SELECTs.

( Created_tmp_disk_tables ) = 69,721 / 38829 = 1.8 /sec -- Frequency of creating disk "temp" tables as part of complex SELECTs -- increase tmp_table_size and max_heap_table_size. Check the rules for temp tables on when MEMORY is used instead of MyISAM. Perhaps minor schema or query changes can avoid MyISAM. Better indexes and reformulation of queries are more likely to help.

( Select_full_join ) = 341,473 / 38829 = 8.8 /sec -- joins without index -- Add suitable index(es) to tables used in JOINs.

( Select_scan ) = 20,070,358 / 38829 = 516 /sec -- full table scans -- Add indexes / optimize queries (unless they are tiny tables)

( Select_scan / Com_select ) = 20,070,358 / 46219009 = 43.4% -- % of selects doing full table scan. (May be fooled by Stored Routines.) -- Add indexes / optimize queries

( Com_insert + Com_delete + Com_delete_multi + Com_replace + Com_update + Com_update_multi ) = (1395278 + 512340 + 0 + 1843851 + 14825066 + 9554) / 38829 = 478 /sec -- writes/sec -- 50 writes/sec + log flushes will probably max out I/O write capacity of normal drives

( Com_replace ) = 1,843,851 / 38829 = 47 /sec -- Consider changing to INSERT ... ON DUPLICATE KEY UPDATE.

( expire_logs_days ) = 0 -- How soon to automatically purge binlog (after this many days) -- Too large (or zero) = consumes disk space; too small = need to respond quickly to network/machine crash. (Not relevant if log_bin = OFF)

( slave_pending_jobs_size_max / max_allowed_packet ) = 128M / 64M = 2 -- For parallel slave threads -- slave_pending_jobs_size_max must not be less than max_allowed_packet

( long_query_time ) = 10 -- Cutoff (Seconds) for defining a "slow" query. -- Suggest 2

( back_log / max_connections ) = 1,600 / 1600 = 100.0%

( Connections ) = 1,054,868 / 38829 = 27 /sec -- Connections -- Increase wait_timeout; use pooling?

( thread_cache_size ) = 192 -- How many extra processes to keep around (Not relevant when using thread pooling) (Autosized as of 5.6.8; based on max_connections)

Abnormally large:

Bytes_received = 1583925 /sec
Com_begin = 11 /sec
Com_create_db = 0.093 /HR
Com_create_function = 4.1 /HR
Com_create_procedure = 0.37 /HR
Com_create_table = 0.21 /sec
Com_delete = 13 /sec
Com_drop_procedure = 0.37 /HR
Com_insert_select = 29 /sec
Com_insert_select + Com_replace_select = 75 /sec
Com_kill = 0.74 /HR
Com_replace_select = 47 /sec
Com_show_create_db = 2.2 /HR
Com_show_create_trigger = 0.28 /HR
Com_show_events = 0.74 /HR
Com_show_storage_engines = 0.93 /HR
Com_show_warnings = 16 /HR
Com_stmt_close = 919 /sec
Com_stmt_execute = 919 /sec
Com_stmt_prepare = 919 /sec
Com_truncate = 0.16 /sec
Com_update = 381 /sec
Com_update_multi = 0.25 /sec
Handler_commit = 1638 /sec
Handler_delete = 60 /sec
Handler_external_lock = 5083 /sec
Handler_read_key = 58652 /sec
Handler_read_next = 607861 /sec
Handler_update = 40571 /sec
Innodb_buffer_pool_bytes_dirty = 1,309.3MB
Innodb_buffer_pool_pages_data = 1.18e+7
Innodb_buffer_pool_pages_dirty = 83,894
Innodb_buffer_pool_pages_flushed / max(Questions, Queries) = 0.657
Innodb_buffer_pool_pages_misc = 664,249
Innodb_buffer_pool_pages_total = 1.25e+7
Innodb_data_fsyncs = 232 /sec
Innodb_data_pending_fsyncs = 0.46 /HR
Innodb_data_read = 12400226 /sec
Innodb_data_reads = 770 /sec
Innodb_data_writes = 13131 /sec
Innodb_data_written = 88604121 /sec
Innodb_dblwr_pages_written = 2476 /sec
Innodb_dblwr_pages_written / Innodb_dblwr_writes = 81.6
Innodb_log_write_requests = 10906 /sec
Innodb_os_log_fsyncs = 87 /sec
Innodb_os_log_written / (Uptime / 3600) / innodb_log_files_in_group = 11,052.3MB
Innodb_pages_created = 57 /sec
Innodb_pages_written = 2478 /sec
Innodb_rows_deleted = 60 /sec
Innodb_rows_deleted + Innodb_rows_inserted = 658 /sec
Innodb_rows_inserted = 598 /sec
Innodb_rows_updated = 520 /sec
Max_execution_time_set = 0.0MB
Max_used_connections = 768
Select_range = 1250 /sec
Select_range / Com_select = 105.0%
Sort_range = 274 /sec
Sort_scan = 509 /sec
Table_open_cache_hits = 3826 /sec
Threads_cached = 142
Threads_connected = 366
back_log = 1,600
innodb_io_capacity_max = 20,000
innodb_max_dirty_pages_pct_lwm = 1000.0%
innodb_undo_tablespaces = 2
max_error_count = 1,024
max_length_for_sort_data = 4,096
max_user_connections = 1,500
optimizer_trace_max_mem_size = 1.05e+6
slave_pending_jobs_size_max = 128MB

Abnormal strings:

bind_address = 0.0.0.0
event_scheduler = ON
explicit_defaults_for_timestamp = ON
have_query_cache = NO
have_ssl = YES
have_symlink = DISABLED
innodb_buffer_pool_dump_at_shutdown = ON
innodb_buffer_pool_load_at_startup = ON
innodb_fast_shutdown = 1
innodb_undo_directory = ./
innodb_undo_log_truncate = ON
master_info_repository = TABLE
optimizer_trace = enabled=off,one_line=off
optimizer_trace_features = greedy_search=on, range_optimizer=on, dynamic_range=on, repeated_subselect=on
relay_log_info_repository = TABLE
slave_rows_search_algorithms = INDEX_SCAN,HASH_SCAN
ssl_ca = ca.pem
ssl_cert = server-cert.pem
ssl_key = server-key.pem
transaction_write_set_extraction = XXHASH64
Rick James
UPDATE  task_s_b SET result_delivered=1, result_data='DISABLED',
        result_gathered=1
    WHERE  result_data is NULL
      AND  (assigned_counter >= 4
              OR  c_counter > 2
           ) 

Depending on the distribution of the values in that table, this query may be causing trouble.

Consider rewriting as two queries:

UPDATE  task_s_b SET result_delivered=1, result_data='DISABLED',
        result_gathered=1
    WHERE  result_data is NULL
      AND  assigned_counter >= 4;
UPDATE  task_s_b SET result_delivered=1, result_data='DISABLED',
        result_gathered=1
    WHERE  result_data is NULL
      AND  c_counter > 2;

And adding these two composite indexes with the columns in the specified order:

INDEX(result_data, assigned_counter)
INDEX(result_data, c_counter)

To discuss further, please provide SHOW CREATE TABLE and SHOW TABLE STATUS for that table.

This may cut back on deadlocks and speed up the server.

Rick James

Suggestions to consider, with no stop/start of services, for immediate relief based on the data available at this time (Jan 22, 2019 at 15:00 CT USA):

SET GLOBAL innodb_io_capacity = 10000;      -- up from your current 5000
SET GLOBAL innodb_lru_scan_depth = 100;     -- down from your 256, to cut the CPU cycles spent on this function every second by roughly 60%

After your SHOW GLOBAL VARIABLES output has been posted and analyzed, there will be more suggestions.

Wilson Hauck