
I have MariaDB 10.5 on my desktop with multiple disks (SSD and HDD) for write-intensive projects. Writing to a single table is fast and the percentage of dirty pages remains close to zero with 1000-3000 writes/s.

However, when I actively write to multiple tables at the same time, the percentage of dirty pages climbs quickly. The problem is that flushing to disk drops to about 100 writes/s and stays at that level.

This behaviour remains until a restart.

I think the problem is somehow related (though not identical) to an issue Percona identified 10 years ago.

Is there any trick to keep the speed of flushing?

key_buffer_size     = 20M
max_allowed_packet  = 5G
thread_stack        = 256K
thread_cache_size       = 8
innodb_buffer_pool_size = 70G
innodb_log_buffer_size = 512M
innodb_log_file_size = 20G
innodb_thread_concurrency = 0
innodb_flush_log_at_trx_commit = 0
innodb_compression_level = 6
innodb_io_capacity=2000
innodb_io_capacity_max=30000
innodb_max_dirty_pages_pct=0
innodb_doublewrite = 0
innodb_flush_method = O_DIRECT
innodb_lru_scan_depth=128
innodb_purge_threads=8
innodb_purge_batch_size=600
innodb_flush_neighbors=0
innodb_change_buffer_max_size=50
innodb_buffer_pool_load_at_startup=OFF
innodb_buffer_pool_dump_at_shutdown=OFF
innodb-ft-result-cache-limit=4G
innodb_fatal_semaphore_wait_threshold=7200
innodb_compression_default=ON
innodb_random_read_ahead=1

UPDATE: Possible Solution

I post this with some hesitation, as I am not sure it is the real solution. After much experimentation, I found the culprit to be adaptive flushing. I tackled the problem with

innodb_adaptive_flushing=0
innodb_adaptive_flushing_lwm=70

Apparently, once adaptive flushing kicks in to avoid high I/O, it stays active for a long time.
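Both variables are dynamic in MariaDB 10.5, so (as a sketch) the change can be tried on a running server before committing it to the config file:

```sql
-- Dynamic system variables; no restart needed to experiment:
SET GLOBAL innodb_adaptive_flushing = 0;
SET GLOBAL innodb_adaptive_flushing_lwm = 70;
```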

UPDATE2: page vs column compression

I identified the problem to be

innodb_compression_default=ON

Following a suggestion by Rick James, I created similar tables with column compression instead of page compression. The compression ratio is about 3:1 for both methods (10-20% better with page compression, since it applies to the whole table rather than selected columns), but the performance on HDD was significantly different.

I think the problem is writing simultaneously to the multiple sparse files created by page compression on an HDD (it should not be an issue on an SSD).

I need to recreate all the tables to be sure, and the process is painfully time-consuming.
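For reference, the two variants being compared look roughly like this (table and column names are placeholders; MariaDB 10.5 syntax):

```sql
-- Page compression: the tablespace is stored as a sparse file,
-- with holes punched where pages compress well.
CREATE TABLE t_page (
  id   BIGINT AUTO_INCREMENT PRIMARY KEY,
  body MEDIUMTEXT
) ENGINE=InnoDB PAGE_COMPRESSED=1;

-- Column compression: only the marked column is compressed,
-- and the data file is a normal (non-sparse) file.
CREATE TABLE t_col (
  id   BIGINT AUTO_INCREMENT PRIMARY KEY,
  body MEDIUMTEXT COMPRESSED
) ENGINE=InnoDB;
```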

Rick James
Googlebot

4 Answers


I suspect that innodb_io_capacity_max=30000 is much too large. Try 5000.

innodb_max_dirty_pages_pct=0 -- Dirty pages are good; don't try to avoid them with "0". The default is 75 (percent); MariaDB 10.5.7 decided that a better default is 90. Try one of those. Note that this setting is GLOBAL and dynamic, so no restart is needed.
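Since the setting is dynamic, trying a saner value on the live server would look like this:

```sql
SET GLOBAL innodb_max_dirty_pages_pct = 90;  -- the 10.5.7+ default
-- or the older default:
-- SET GLOBAL innodb_max_dirty_pages_pct = 75;
```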

By not aggressively flushing 'dirty' pages, you are providing the possibility that a block ('page') will be written to more than once before it actually needs to be written to disk.

What tool is telling you "100 writes/s" ?

For a deeper dive, please provide the Global Status and Variables: http://mysql.rjweb.org/doc.php/mysql_analysis#tuning
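One way to measure the flush rate from the server itself, rather than an external tool, is to sample its counters twice and divide the delta by the elapsed seconds (a rough sketch):

```sql
SHOW GLOBAL STATUS LIKE 'Innodb_data_writes';
-- ...wait N seconds, then run it again;
-- (second value - first value) / N gives writes per second.
-- Innodb_buffer_pool_pages_flushed tracks flushed dirty pages specifically.
```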

Rick James

100 disk writes per second -- That sounds like the max speed of an HDD.

100 rows written per second -- That sounds like a very inefficient way to do INSERT (or UPDATE).

autocommit = ON and not inside BEGIN..COMMIT, plus innodb_flush_log_at_trx_commit = 1 and/or sync_binlog = 1 -- That sounds like there will be a flush to the log for every statement.

Show us the "write" statement. Let's discuss how you can "batch" INSERT, thereby avoiding the flush/sync for every row.
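As a sketch of what batching looks like (table and column names are hypothetical):

```sql
-- One transaction, one multi-row statement, one log flush,
-- instead of one flush per row:
BEGIN;
INSERT INTO t (a, b) VALUES
  (1, 'x'),
  (2, 'y'),
  (3, 'z');   -- extend to a few hundred rows per statement
COMMIT;
```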

Rick James

With MariaDB and "rows usually have long mediumtext fields", consider using column compression.
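For an existing table, that would look something like this (table and column names hypothetical; COMPRESSED is MariaDB's per-column compression attribute, available since 10.3):

```sql
ALTER TABLE articles
  MODIFY body MEDIUMTEXT COMPRESSED;
```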

Rick James

All the optimizations in InnoDB for dirty pages, redo logging, page flushing, I/O capacity, etc. are designed with the assumption that high write traffic is intermittent. That is, they depend on there being periods of low write traffic so all the deferred flushing can "catch up."

If you have a very high, continuous rate of writes, and your InnoDB engine can't keep up, then you can't keep throwing more write traffic at a single disk device. There's a hard ceiling to the amount of I/O any single disk can handle.

So if other means of optimization are tapped out, then your options may be:

  • Move to a more high-end I/O system, for example RAID0 or RAID10. Even if you have an SSD disk, that's not infinitely scalable. A striped array of SSD disks is superior to a single SSD disk.

  • Distribute writes to multiple MySQL instances, even if they are sharing the same storage. Even with the most powerful I/O array in the world, eventually your bottleneck will be InnoDB itself.

  • Distribute writes to multiple server hosts. You gain more parallel I/O capacity by using more servers. I support some apps that need to spread out writes over hundreds of shards, running in docker containers, using between 2 and 8 containers per physical host. Whatever it takes to get enough parallelization and I/O capacity to handle the write traffic.

If you need to scale further, you may find that MySQL, being designed to optimize OLTP traffic, is not the best technology for a high rate of writes. You may want to evaluate RocksDB or ScyllaDB or other specialized log-structured data services.

Bill Karwin