MariaDB intermittent crashes

Question

I have a MariaDB 10.2 database where the MariaDB service stops unexpectedly.

I regularly log the RAM usage of the top three processes for every minute using a cronjob, to determine if MariaDB consumes excessive memory before the crashes. I'm inclined to believe that excessive RAM usage is not the root cause of the issue.

When the service crashes, I reviewed the MariaDB logs using the following commands:

sudo cat /var/log/mariadb/mariadb.log - returns no results.
sudo systemctl status mariadb -l - yields the following output:

OS information:

[someuser ~]$ uname -a
Linux ec2.internal .amzn2.x86_64 #1 SMP Thu Dec 8 01:29:11 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
[someuser ~]$ hostnamectl
   Static hostname: <ip>.ec2.internal
         Icon name: computer-vm
           Chassis: vm
        Machine ID: -
           Boot ID: -
    Virtualization: xen
  Operating System: Amazon Linux 2
       CPE OS Name: cpe:2.3:o:amazon:amazon_linux:2
            Kernel: Linux <ip>.amzn2.x86_64
      Architecture: x86-64

I would appreciate your assistance in diagnosing and resolving this server stability problem.

Answers to the comment from @NikitaKipriyanov: Hi @NikitaKipriyanov, thank you for questions. Following your advice I ran dmesg -T | grep mysql and found out this (pasting only the relevant data):

[Mon Sep 18 17:37:34 2023] [  26231]    27 26231    29965       52    61440        0             0 mysql-prepare-d
[Tue Sep 19 14:04:29 2023] [  26839]    27 26839   367222    19121   630784        0             0 mysqld
[Tue Sep 19 14:04:31 2023] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=mysqld,pid=26839,uid=27
[Tue Sep 19 14:04:31 2023] Out of memory: Killed process 26839 (mysqld) total-vm:1468888kB, anon-rss:76484kB, file-rss:0kB, shmem-rss:0kB, UID:27 pgtables:616kB oom_score_adj:0
[Tue Sep 19 14:05:41 2023] [  29800]    27 29800     2326       26    61440        0             0 mysql-check-soc
[Tue Sep 19 14:05:52 2023] [  29800]    27 29800     2326       26    61440        0             0 mysql-check-soc

The date and time correspond with when the service crashed.

When I run dmesg -T I see more often that the web server httpd has crashed:

[Tue Sep 19 14:13:00 2023] [  30195]     0 30195   143596     2011   839680        0             0 httpd
[Tue Sep 19 14:13:00 2023] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=httpd,pid=28937,uid=48
[Tue Sep 19 14:13:00 2023] Out of memory: Killed process 28937 (httpd) total-vm:628120kB, anon-rss:11732kB, file-rss:0kB, shmem-rss:96kB, UID:48 pgtables:860kB oom_score_adj:0
[Tue Sep 19 14:13:17 2023] systemd-journald[30073]: Received SIGTERM from PID 1 (systemd).

However, the web server has never experienced crashes or interruptions in its operation. The issue only affects the MariaDB service. The issue is intermittent and occurs every three to five days.

Logs from /var/log/mariadb:

[root@ip mariadb]# ls -la
total 28
drwxr-x---  2 mysql mysql  146 Sep 20 03:48 .
drwxr-xr-x 10 root  root  4096 Sep 26 03:23 ..
-rw-------  1 mysql mysql    0 Sep 20 03:48 mariadb.log
-rw-------  1 mysql mysql 8714 Sep  7 12:40 mariadb.log-20230908
-rw-------  1 mysql mysql 2005 Sep 15 18:02 mariadb.log-20230916.gz
-rw-------  1 mysql mysql 1917 Sep 18 17:39 mariadb.log-20230919.gz
-rw-------  1 mysql mysql 1909 Sep 19 14:16 mariadb.log-20230920.gz
[root@ip mariadb]# pwd
/var/log/mariadb

Output of sysctl -A | grep panic:

[root@ip log]# sysctl -A | grep panic
fs.xfs.panic_mask = 0
kernel.hardlockup_panic = 0
kernel.hung_task_panic = 0
kernel.panic = 30
kernel.panic_on_io_nmi = 0
kernel.panic_on_oops = 0
kernel.panic_on_rcu_stall = 0
kernel.panic_on_unrecovered_nmi = 0
kernel.panic_on_warn = 0
kernel.panic_print = 0
kernel.softlockup_panic = 0
kernel.unknown_nmi_panic = 0
sysctl: reading key "net.ipv6.conf.all.stable_secret"
sysctl: reading key "net.ipv6.conf.default.stable_secret"
sysctl: reading key "net.ipv6.conf.eth0.stable_secret"
sysctl: reading key "net.ipv6.conf.lo.stable_secret"
vm.panic_on_oom = 0

Answers to Wilson Hauck questions:

Regarding RAM:

[ec2-user@ip ~]$ free -h
              total        used        free      shared  buff/cache   available
Mem:           964M        333M        197M        724K        433M        451M
Swap:            0B          0B          0B

SELECT COUNT(*) FROM information_schema.tables; - 238

SHOW GLOBAL STATUS; - Returns too many rows to be pasted here, which value are you looking for

SHOW GLOBAL VARIABLES;- Same

How should I send the output of all requested queries?

score 1 · Answer 1 · answered Sep 26 '23 at 13:40

Do I get a prize for guessing correctly?

There's lots of answers here and elsewhere on the internet on how to address this. Unfortunately a lot of them are wrong. If they mention oom_adj or oom_score_adj then this is NOT how to fix your issue.

There are 2 things you need to do.

The first is to look at your configuration for your webserver and the database (and presumably there some application tier in here as well) to ensure that they don't try to use up more RAM than you have available. Since you didn't tell us about the application server nor the webserver I can't really advise (but here's a hint for pre-fork Apache, optionally with PHP). As for Mariadb - go get a copy of mysqltuner.pl and run it against your installation.

That should prevent you running out of memory most of the time, but you should also reduce the amount of non-existent memory the kernel will hand out:

sysctl vm.overcommit_memory=2
sysctl vm.overcommit_ratio=20

...and try lower values for the ratio if you are still seeing OOM killer.

score 1 · Answer 2 · edited Jan 08 '25 at 05:43

sysctl vm.overcommit_memory=2

sysctl vm.overcommit_ratio=20

is dangerous as it gets "cannot allocate memory" errors for every bash command and nothing will operate including postfix and apache till I rebooted my Linux EC2 instance in aws.amazon.com console and the original safe vales of 0 and 50 were reset respectively so I could ssh in and all the services were restarted OK.

If mysqld crashes with oom-kill memory outage just restart it - same old:

service mysqld start

Do not meddle with vm.overcommit_memory or vm.overcommit_ratio values or risk losing all on the server unless backed up.

MariaDB intermittent crashes

2 Answers2