
I'm trying to bootstrap a MariaDB Galera cluster in Docker containers. The following configuration works when I set network_mode: host, but then I can't directly access the MariaDB from inside other containers, and I'd rather use ports published by Docker instead of directly binding to the host interface. Every container is running on a separate host. This is not a Docker Swarm or Kubernetes setup, just three separate hosts with Docker.

Basically I've been converting the commands from the Bitnami Galera image documentation into an Ansible playbook.

tasks:
  - name: Deploy mysql bootstrap container
    docker_container:
      name: mariadb
      image: "bitnami/mariadb-galera:10.5"
      pull: yes
      restart_policy: "unless-stopped"
      # network_mode: host
      ports:
        - "0.0.0.0:3306:3306"
        - "0.0.0.0:4567:4567"
        - "0.0.0.0:4567:4567/udp"
        - "0.0.0.0:4568:4568"
        - "0.0.0.0:4444:4444"
      volumes:
        - "{{ mariadb_datadir }}:/bitnami/mariadb"
      env:
        MARIADB_DATABASE: "{{ nextcloud_db_name }}"
        MARIADB_USER: "{{ nextcloud_db_username }}"
        MARIADB_PASSWORD: "{{ nextcloud_db_password }}"
        MARIADB_ROOT_PASSWORD: "{{ mariadb_root_password }}"
        MARIADB_GALERA_CLUSTER_NAME: "{{ mariadb_galera_cluster_name }}"
        MARIADB_GALERA_MARIABACKUP_PASSWORD: "{{ mariadb_galera_mariabackup_password }}"
        MARIADB_GALERA_CLUSTER_BOOTSTRAP: "{{ mariadb_galera_cluster_bootstrap }}"
        MARIADB_REPLICATION_PASSWORD: "{{ mariadb_replication_password }}"
        MARIADB_GALERA_CLUSTER_ADDRESS: "gcomm://"
    when: inventory_hostname == groups[mariadb_galera_cluster_name][0]
  - name: Wait for Galera Cluster status
    shell:
      cmd: "docker exec -i mariadb mysql -u root -p{{ mariadb_root_password }} -s -e \"SELECT variable_value FROM information_schema.global_status WHERE variable_name='WSREP_LOCAL_STATE_COMMENT'\""
    register: galera_status
    retries: 60
    delay: 10
    until: galera_status.stdout == 'Synced'
    when: inventory_hostname == groups[mariadb_galera_cluster_name][0]
  - name: Deploy mysql cluster containers
    docker_container:
      name: mariadb
      image: "bitnami/mariadb-galera:10.5"
      pull: yes
      restart_policy: "unless-stopped"
      # network_mode: host
      ports:
        - "0.0.0.0:3306:3306"
        - "0.0.0.0:4567:4567"
        - "0.0.0.0:4567:4567/udp"
        - "0.0.0.0:4568:4568"
        - "0.0.0.0:4444:4444"
      volumes:
        - "{{ mariadb_datadir }}:/bitnami/mariadb"
      env:
        MARIADB_ROOT_PASSWORD: "{{ mariadb_root_password }}"
        MARIADB_GALERA_CLUSTER_NAME: "{{ mariadb_galera_cluster_name }}"
        MARIADB_GALERA_MARIABACKUP_PASSWORD: "{{ mariadb_galera_mariabackup_password }}"
        MARIADB_REPLICATION_PASSWORD: "{{ mariadb_replication_password }}"
        MARIADB_GALERA_CLUSTER_ADDRESS: "gcomm://{{ groups[mariadb_galera_cluster_name] | map('extract', hostvars, ['ansible_default_ipv4', 'address']) | join(':4567,') }}:4567"
    when: inventory_hostname != groups[mariadb_galera_cluster_name][0]

When I comment out network_mode and publish the ports directly, the containers start up, but the nodes can't join the cluster. I get a lot of these messages:

2021-07-28 9:31:51 0 [Warning] WSREP: Member 0.0 (f38e8cff30b6) requested state transfer from '*any*', but it is impossible to select State Transfer donor: Resource temporarily unavailable

which ends with:

WSREP_SST: [ERROR] Possible timeout in receiving first data from donor in gtid stage: exit codes: 124 0 (20210728 09:31:22.716)

Full log of one of the nodes:

mariadb 09:26:21.27
mariadb 09:26:21.28 Welcome to the Bitnami mariadb-galera container
mariadb 09:26:21.28 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-mariadb-galera
mariadb 09:26:21.29 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-mariadb-galera/issues
mariadb 09:26:21.29
mariadb 09:26:21.29 INFO  ==> ** Starting MariaDB setup **
mariadb 09:26:21.31 INFO  ==> Validating settings in MYSQL_*/MARIADB_* env vars
mariadb 09:26:21.34 INFO  ==> Initializing mariadb database
mariadb 09:26:21.36 INFO  ==> Updating 'my.cnf' with custom configuration
mariadb 09:26:21.37 INFO  ==> Setting wsrep_node_name option
mariadb 09:26:21.38 INFO  ==> Setting wsrep_node_address option
mariadb 09:26:21.39 INFO  ==> Setting wsrep_cluster_name option
mariadb 09:26:21.42 INFO  ==> Setting wsrep_cluster_address option
mariadb 09:26:21.43 INFO  ==> Setting wsrep_sst_auth option
mariadb 09:26:21.45 INFO  ==> ** MariaDB setup finished! **

mariadb 09:26:21.49 INFO ==> ** Starting MariaDB **
mariadb 09:26:21.49 INFO ==> Setting previous boot
2021-07-28 9:26:21 0 [Note] /opt/bitnami/mariadb/sbin/mysqld (mysqld 10.5.11-MariaDB-log) starting as process 1 ...
2021-07-28 9:26:21 0 [Note] WSREP: Loading provider /opt/bitnami/mariadb/lib/libgalera_smm.so initial position: 00000000-0000-0000-0000-000000000000:-1
2021-07-28 9:26:21 0 [Note] WSREP: wsrep_load(): loading provider library '/opt/bitnami/mariadb/lib/libgalera_smm.so'
2021-07-28 9:26:21 0 [Note] WSREP: wsrep_load(): Galera 4.8(rXXXX) by Codership Oy <info@codership.com> loaded successfully.
2021-07-28 9:26:21 0 [Note] WSREP: CRC-32C: using 64-bit x86 acceleration.
2021-07-28 9:26:21 0 [Warning] WSREP: Could not open state file for reading: '/bitnami/mariadb/data//grastate.dat'
2021-07-28 9:26:21 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 1
2021-07-28 9:26:21 0 [Note] WSREP: GCache DEBUG: opened preamble: Version: 0 UUID: 00000000-0000-0000-0000-000000000000 Seqno: -1 - -1 Offset: -1 Synced: 0
2021-07-28 9:26:21 0 [Note] WSREP: Skipped GCache ring buffer recovery: could not determine history UUID.
2021-07-28 9:26:21 0 [Note] WSREP: Passing config to GCS: base_dir = /bitnami/mariadb/data/; base_host = 172.17.0.2; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /bitnami/mariadb/data/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc
2021-07-28 9:26:21 0 [Note] WSREP: Start replication
2021-07-28 9:26:21 0 [Note] WSREP: Connecting with bootstrap option: 0
2021-07-28 9:26:21 0 [Note] WSREP: Setting GCS initial position to 00000000-0000-0000-0000-000000000000:-1
2021-07-28 9:26:21 0 [Note] WSREP: protonet asio version 0
2021-07-28 9:26:21 0 [Note] WSREP: Using CRC-32C for message checksums.
2021-07-28 9:26:21 0 [Note] WSREP: backend: asio
2021-07-28 9:26:21 0 [Note] WSREP: gcomm thread scheduling priority set to other:0
2021-07-28 9:26:21 0 [Warning] WSREP: access file(/bitnami/mariadb/data//gvwstate.dat) failed(No such file or directory)
2021-07-28 9:26:21 0 [Note] WSREP: restore pc from disk failed
2021-07-28 9:26:21 0 [Note] WSREP: GMCast version 0
2021-07-28 9:26:21 0 [Note] WSREP: (def73391-b951, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2021-07-28 9:26:21 0 [Note] WSREP: (def73391-b951, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2021-07-28 9:26:21 0 [Note] WSREP: EVS version 1
2021-07-28 9:26:21 0 [Note] WSREP: gcomm: connecting to group 'nextcloud_int', peer '10.0.28.26:4567,10.0.28.31:4567,10.0.28.32:4567'
2021-07-28 9:26:21 0 [Note] WSREP: (def73391-b951, 'tcp://0.0.0.0:4567') Found matching local endpoint for a connection, blacklisting address tcp://10.0.28.31:4567
2021-07-28 9:26:21 0 [Note] WSREP: (def73391-b951, 'tcp://0.0.0.0:4567') connection established to d9e457ac-8a0d tcp://10.0.28.26:4567
2021-07-28 9:26:21 0 [Note] WSREP: (def73391-b951, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2021-07-28 9:26:21 0 [Note] WSREP: (def73391-b951, 'tcp://0.0.0.0:4567') connection established to df17c022-aae4 tcp://10.0.28.32:4567
2021-07-28 9:26:22 0 [Note] WSREP: EVS version upgrade 0 -> 1
2021-07-28 9:26:22 0 [Note] WSREP: declaring d9e457ac-8a0d at tcp://10.0.28.26:4567 stable
2021-07-28 9:26:22 0 [Note] WSREP: declaring df17c022-aae4 at tcp://10.0.28.32:4567 stable
2021-07-28 9:26:22 0 [Note] WSREP: PC protocol upgrade 0 -> 1
2021-07-28 9:26:22 0 [Note] WSREP: Node d9e457ac-8a0d state prim
2021-07-28 9:26:22 0 [Note] WSREP: view(view_id(PRIM,d9e457ac-8a0d,2) memb { d9e457ac-8a0d,0 def73391-b951,0 df17c022-aae4,0 } joined { } left { } partitioned { })
2021-07-28 9:26:22 0 [Note] WSREP: save pc into disk
2021-07-28 9:26:22 0 [Note] WSREP: gcomm: connected
2021-07-28 9:26:22 0 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
2021-07-28 9:26:22 0 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2021-07-28 9:26:22 0 [Note] WSREP: Opened channel 'nextcloud_int'
2021-07-28 9:26:22 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 3
2021-07-28 9:26:22 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2021-07-28 9:26:22 1 [Note] WSREP: Starting rollbacker thread 1
2021-07-28 9:26:22 0 [Note] WSREP: STATE EXCHANGE: sent state msg: df444d26-ef85-11eb-8936-be401295de77
2021-07-28 9:26:22 0 [Note] WSREP: STATE EXCHANGE: got state msg: df444d26-ef85-11eb-8936-be401295de77 from 0 (e4a7a8af0b78)
2021-07-28 9:26:22 0 [Note] WSREP: STATE EXCHANGE: got state msg: df444d26-ef85-11eb-8936-be401295de77 from 2 (cd8aaaad506b)
2021-07-28 9:26:22 2 [Note] WSREP: Starting applier thread 2
2021-07-28 9:26:22 0 [Note] WSREP: STATE EXCHANGE: got state msg: df444d26-ef85-11eb-8936-be401295de77 from 1 (f38e8cff30b6)
2021-07-28 9:26:22 0 [Note] WSREP: Quorum results: version = 6, component = PRIMARY, conf_id = 1, members = 1/3 (joined/total), act_id = 15, last_appl. = 14, protocols = 2/10/4 (gcs/repl/appl), vote policy= 0, group UUID = d6aa7099-ef85-11eb-9fbe-dbb821267db3
2021-07-28 9:26:22 0 [Note] WSREP: Flow-control interval: [28, 28]
2021-07-28 9:26:22 0 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 16)
2021-07-28 9:26:22 2 [Note] WSREP: ####### processing CC 16, local, ordered
2021-07-28 9:26:22 2 [Note] WSREP: Process first view: d6aa7099-ef85-11eb-9fbe-dbb821267db3 my uuid: def73391-ef85-11eb-b951-cb71f524bed3
2021-07-28 9:26:22 2 [Note] WSREP: Server f38e8cff30b6 connected to cluster at position d6aa7099-ef85-11eb-9fbe-dbb821267db3:16 with ID def73391-ef85-11eb-b951-cb71f524bed3
2021-07-28 9:26:22 2 [Note] WSREP: Server status change disconnected -> connected
2021-07-28 9:26:22 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2021-07-28 9:26:22 2 [Note] WSREP: ####### My UUID: def73391-ef85-11eb-b951-cb71f524bed3
2021-07-28 9:26:22 2 [Note] WSREP: Cert index reset to 00000000-0000-0000-0000-000000000000:-1 (proto: 10), state transfer needed: yes
2021-07-28 9:26:22 0 [Note] WSREP: Service thread queue flushed.
2021-07-28 9:26:22 2 [Note] WSREP: ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:-1, protocol version: -1
2021-07-28 9:26:22 2 [Note] WSREP: State transfer required: Group state: d6aa7099-ef85-11eb-9fbe-dbb821267db3:16 Local state: 00000000-0000-0000-0000-000000000000:-1
2021-07-28 9:26:22 2 [Note] WSREP: Server status change connected -> joiner
2021-07-28 9:26:22 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2021-07-28 9:26:22 0 [Note] WSREP: Joiner monitor thread started to monitor
2021-07-28 9:26:22 0 [Note] WSREP: Running: 'wsrep_sst_mariabackup --role 'joiner' --address '172.17.0.2' --datadir '/bitnami/mariadb/data/' --defaults-file '/opt/bitnami/mariadb/conf/my.cnf' --parent '1' --binlog 'mysql-bin' --mysqld-args --defaults-file=/opt/bitnami/mariadb/conf/my.cnf --basedir=/opt/bitnami/mariadb --datadir=/bitnami/mariadb/data --socket=/opt/bitnami/mariadb/tmp/mysql.sock --pid-file=/opt/bitnami/mariadb/tmp/mysqld.pid'
WSREP_SST: [INFO] SSL configuration: CA='', CERT='', KEY='', MODE='DISABLED', encrypt='0' (20210728 09:26:22.594)
WSREP_SST: [INFO] Streaming with mbstream (20210728 09:26:22.684)
WSREP_SST: [INFO] Using socat as streamer (20210728 09:26:22.688)
WSREP_SST: [INFO] Evaluating timeout -k 310 300 socat -u TCP-LISTEN:4444,reuseaddr stdio | '/opt/bitnami/mariadb//bin/mbstream' -x; RC=( ${PIPESTATUS[@]} ) (20210728 09:26:22.709)
2021-07-28 9:26:23 2 [Note] WSREP: ####### IST uuid:00000000-0000-0000-0000-000000000000 f: 0, l: 16, STRv: 3
2021-07-28 9:26:23 2 [Note] WSREP: IST receiver addr using tcp://172.17.0.2:4568
2021-07-28 9:26:23 2 [Note] WSREP: Prepared IST receiver for 0-16, listening at: tcp://172.17.0.2:4568
2021-07-28 9:26:23 0 [Note] WSREP: Member 1.0 (f38e8cff30b6) requested state transfer from 'any'. Selected 0.0 (e4a7a8af0b78)(SYNCED) as donor.
2021-07-28 9:26:23 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 16)
2021-07-28 9:26:23 2 [Note] WSREP: Requesting state transfer: success, donor: 0
2021-07-28 9:26:23 2 [Note] WSREP: Resetting GCache seqno map due to different histories.
2021-07-28 9:26:23 2 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> d6aa7099-ef85-11eb-9fbe-dbb821267db3:16
2021-07-28 9:26:23 0 [Warning] WSREP: Member 2.0 (cd8aaaad506b) requested state transfer from 'any', but it is impossible to select State Transfer donor: Resource temporarily unavailable
2021-07-28 9:26:24 0 [Warning] WSREP: Member 2.0 (cd8aaaad506b) requested state transfer from 'any', but it is impossible to select State Transfer donor: Resource temporarily unavailable
2021-07-28 9:26:25 0 [Note] WSREP: (def73391-b951, 'tcp://0.0.0.0:4567') turning message relay requesting off
2021-07-28 9:26:25 0 [Warning] WSREP: Member 2.0 (cd8aaaad506b) requested state transfer from 'any', but it is impossible to select State Transfer donor: Resource temporarily unavailable
2021-07-28 9:26:26 0 [Warning] WSREP: Member 2.0 (cd8aaaad506b) requested state transfer from 'any', but it is impossible to select State Transfer donor: Resource temporarily unavailable
2021-07-28 9:26:27 0 [Warning] WSREP: Member 2.0 (cd8aaaad506b) requested state transfer from 'any', but it is impossible to select State Transfer donor: Resource temporarily unavailable
2021-07-28 9:26:28 0 [Warning] WSREP: Member 2.0 (cd8aaaad506b) requested state transfer from 'any', but it is impossible to select State Transfer donor: Resource temporarily unavailable
2021-07-28 9:26:29 0 [Warning] WSREP: Member 2.0 (cd8aaaad506b) requested state transfer from 'any', but it is impossible to select State Transfer donor: Resource temporarily unavailable

[... message repeats every second ...]

2021-07-28 9:31:20 0 [Warning] WSREP: Member 2.0 (cd8aaaad506b) requested state transfer from 'any', but it is impossible to select State Transfer donor: Resource temporarily unavailable
2021-07-28 9:31:21 0 [Warning] WSREP: Member 2.0 (cd8aaaad506b) requested state transfer from 'any', but it is impossible to select State Transfer donor: Resource temporarily unavailable
2021-07-28 9:31:22 0 [Warning] WSREP: Member 2.0 (cd8aaaad506b) requested state transfer from 'any', but it is impossible to select State Transfer donor: Resource temporarily unavailable
WSREP_SST: [ERROR] Possible timeout in receiving first data from donor in gtid stage: exit codes: 124 0 (20210728 09:31:22.716)
WSREP_SST: [ERROR] Cleanup after exit with status:32 (20210728 09:31:22.718)
WSREP_SST: [INFO] Removing the sst_in_progress file (20210728 09:31:22.721)
WSREP_SST: [INFO] Cleaning up temporary directories (20210728 09:31:22.724)
2021-07-28 9:31:22 0 [ERROR] WSREP: Process completed with error: wsrep_sst_mariabackup --role 'joiner' --address '172.17.0.2' --datadir '/bitnami/mariadb/data/' --defaults-file '/opt/bitnami/mariadb/conf/my.cnf' --parent '1' --binlog 'mysql-bin' --mysqld-args --defaults-file=/opt/bitnami/mariadb/conf/my.cnf --basedir=/opt/bitnami/mariadb --datadir=/bitnami/mariadb/data --socket=/opt/bitnami/mariadb/tmp/mysql.sock --pid-file=/opt/bitnami/mariadb/tmp/mysqld.pid: 32 (Broken pipe)
2021-07-28 9:31:22 0 [ERROR] WSREP: Failed to read uuid:seqno and wsrep_gtid_domain_id from joiner script.
2021-07-28 9:31:22 3 [Note] WSREP: SST received
2021-07-28 9:31:22 3 [Note] WSREP: SST received: 00000000-0000-0000-0000-000000000000:-1
2021-07-28 9:31:22 3 [Note] WSREP: SST succeeded for position 00000000-0000-0000-0000-000000000000:-1
2021-07-28 9:31:22 2 [ERROR] WSREP: Application received wrong state: Received: 00000000-0000-0000-0000-000000000000 Required: d6aa7099-ef85-11eb-9fbe-dbb821267db3
2021-07-28 9:31:22 0 [Note] WSREP: Joiner monitor thread ended with total time 300 sec
2021-07-28 9:31:22 2 [ERROR] WSREP: Application state transfer failed. This is unrecoverable condition, restart required.
2021-07-28 9:31:22 2 [Note] WSREP: ReplicatorSMM::abort()
2021-07-28 9:31:22 2 [Note] WSREP: Closing send monitor...
2021-07-28 9:31:22 2 [Note] WSREP: Closed send monitor.
2021-07-28 9:31:22 2 [Note] WSREP: gcomm: terminating thread
2021-07-28 9:31:22 2 [Note] WSREP: gcomm: joining thread
2021-07-28 9:31:22 2 [Note] WSREP: gcomm: closing backend
2021-07-28 9:31:22 2 [Note] WSREP: view(view_id(NON_PRIM,d9e457ac-8a0d,2) memb { def73391-b951,0 } joined { } left { } partitioned { d9e457ac-8a0d,0 df17c022-aae4,0 })
2021-07-28 9:31:22 2 [Note] WSREP: PC protocol downgrade 1 -> 0
2021-07-28 9:31:22 2 [Note] WSREP: view((empty))
2021-07-28 9:31:22 2 [Note] WSREP: gcomm: closed
2021-07-28 9:31:22 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2021-07-28 9:31:22 0 [Note] WSREP: Flow-control interval: [16, 16]
2021-07-28 9:31:22 0 [Note] WSREP: Received NON-PRIMARY.
2021-07-28 9:31:22 0 [Note] WSREP: Shifting JOINER -> OPEN (TO: 16)
2021-07-28 9:31:22 0 [Note] WSREP: New SELF-LEAVE.
2021-07-28 9:31:22 0 [Note] WSREP: Flow-control interval: [0, 0]
2021-07-28 9:31:22 0 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2021-07-28 9:31:22 0 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 16)
2021-07-28 9:31:22 0 [Note] WSREP: RECV thread exiting 0: Success
2021-07-28 9:31:22 2 [Note] WSREP: recv_thread() joined.
2021-07-28 9:31:22 2 [Note] WSREP: Closing replication queue.
2021-07-28 9:31:22 2 [Note] WSREP: Closing slave action queue.
2021-07-28 9:31:22 2 [Note] WSREP: /opt/bitnami/mariadb/sbin/mysqld: Terminated.
210728 9:31:22 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail.

Server version: 10.5.11-MariaDB-log
key_buffer_size=0
read_buffer_size=131072
max_used_connections=0
max_threads=153
thread_count=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 336801 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7fc554000c18
Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong...
stack_bottom = 0x7fc5780e2e18 thread_stack 0x49000
/opt/bitnami/mariadb/sbin/mysqld(my_print_stacktrace+0x2e)[0x555bfab25a3e]
Printing to addr2line failed
/opt/bitnami/mariadb/sbin/mysqld(handle_fatal_signal+0x485)[0x555bfa5e2ad5]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12730)[0x7fc57ab51730]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x1fd)[0x7fc57a678611]
/opt/bitnami/mariadb/lib/libgalera_smm.so(+0x1cf6ac)[0x7fc57a44f6ac]
/opt/bitnami/mariadb/lib/libgalera_smm.so(+0x5d863)[0x7fc57a2dd863]
/opt/bitnami/mariadb/lib/libgalera_smm.so(+0x79fe5)[0x7fc57a2f9fe5]
/opt/bitnami/mariadb/lib/libgalera_smm.so(+0x681da)[0x7fc57a2e81da]
/opt/bitnami/mariadb/lib/libgalera_smm.so(+0x689a3)[0x7fc57a2e89a3]
/opt/bitnami/mariadb/lib/libgalera_smm.so(+0x68eeb)[0x7fc57a2e8eeb]
/opt/bitnami/mariadb/lib/libgalera_smm.so(+0x918f0)[0x7fc57a3118f0]
/opt/bitnami/mariadb/lib/libgalera_smm.so(+0x91b61)[0x7fc57a311b61]
/opt/bitnami/mariadb/lib/libgalera_smm.so(+0x676e0)[0x7fc57a2e76e0]
/opt/bitnami/mariadb/lib/libgalera_smm.so(+0x440a8)[0x7fc57a2c40a8]
/opt/bitnami/mariadb/sbin/mysqld(_ZN5wsrep18wsrep_provider_v2611run_applierEPNS_21high_priority_serviceE+0xe)[0x555bfabb35ce]
/opt/bitnami/mariadb/sbin/mysqld(+0xbd84f3)[0x555bfa8854f3]
/opt/bitnami/mariadb/sbin/mysqld(_Z15start_wsrep_THDPv+0x2cf)[0x555bfa87603f]
/opt/bitnami/mariadb/sbin/mysqld(+0xb67ffb)[0x555bfa814ffb]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7fa3)[0x7fc57ab46fa3]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7fc57a74f4cf]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x0): (null)
Connection ID (thread ID): 2
Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off

The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains information that should help you find out what is causing the crash.

We think the query pointer is invalid, but we will try to print it anyway. Query:

Writing a core file...
Working directory at /bitnami/mariadb/data
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    0                    bytes
Max resident set          unlimited            unlimited            bytes
Max processes             unlimited            unlimited            processes
Max open files            1048576              1048576              files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       63703                63703                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Core pattern: |/usr/share/apport/apport %p %s %c %d %P %E

2 Answers

1

For now I gave up on this approach after stumbling over these paragraphs in the Redis Cluster documentation:

Currently Redis Cluster does not support NATted environments and in general environments where IP addresses or TCP ports are remapped.

Docker uses a technique called port mapping: programs running inside Docker containers may be exposed with a different port compared to the one the program believes to be using. This is useful in order to run multiple containers using the same ports, at the same time, in the same server.

In order to make Docker compatible with Redis Cluster you need to use the host networking mode of Docker. Please check the --net=host option in the Docker documentation for more information.

I can only assume that Galera has the same problem as Redis. If I'm wrong, please correct me and post an answer; I'll leave the question open for that.
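The log in the question seems consistent with this assumption: the joiner reports base_host = 172.17.0.2 and starts its SST and IST receivers on 172.17.0.2, which looks like an address on Docker's default bridge network rather than one of the host addresses (10.0.28.x) listed in the cluster address. One way to check what a running node advertises is sketched below as an extra playbook task. The task names and the register name galera_node_address are made up for illustration; the command reuses the mysql client call from the question, and it only works on a node where mysqld is already up, such as the bootstrap node.

```yaml
# Sketch only: print the address this node reports to the cluster.
# Assumes the container is named "mariadb", as in the question.
- name: Show the wsrep_node_address reported by this node
  shell:
    cmd: "docker exec -i mariadb mysql -u root -p{{ mariadb_root_password }} -s -N -e \"SHOW GLOBAL VARIABLES LIKE 'wsrep_node_address'\""
  register: galera_node_address   # hypothetical variable name
  changed_when: false             # read-only query, never reports a change
  when: inventory_hostname == groups[mariadb_galera_cluster_name][0]

- name: Print the advertised address
  debug:
    var: galera_node_address.stdout
  when: inventory_hostname == groups[mariadb_galera_cluster_name][0]
```

If the output shows a 172.17.x.x address, the other hosts are being told to send state transfers to an address they cannot route to.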

0

Instead of using host network mode, it's possible to publish the ports in host mode and use the host IP addresses instead of the container IP addresses. Network performance will also be better than, for example, with a network interface that uses the overlay network driver.

ports:
  - target: 3306
    published: 3306
    protocol: tcp
    mode: host
  - target: 4567
    published: 4567
    protocol: tcp
    mode: host
  - target: 4567
    published: 4567
    protocol: udp
    mode: host
  - target: 4568
    published: 4568
    protocol: tcp
    mode: host
  - target: 4444
    published: 4444
    protocol: tcp
    mode: host

Note that in some (maybe Swarm) environments it won't be possible to assign the same host port on different worker nodes, so in that case each DB instance needs its own unique ports.

The .cnf is also important, because the container can't bind a listening port to the host IP address. Instead, it needs to listen on 0.0.0.0, while the other DB instances connect to the same port on the host IP address. Here is an example with the default ports from the Galera documentation and an example host IP (1.2.3.4):

wsrep_node_address = 1.2.3.4:4567
wsrep_node_incoming_address = 0.0.0.0:4567
wsrep_provider_options = 'gmcast.listen_addr=tcp://0.0.0.0:4567;ist.recv_addr=1.2.3.4:4568;ist.recv_bind=0.0.0.0:4568;'
wsrep_sst_receive_address = 1.2.3.4:4444
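For the Bitnami image and the Ansible playbook from the question, the same idea might be expressed through an environment variable rather than a hand-edited .cnf. The following is a minimal sketch, assuming the image tag in use reads MARIADB_GALERA_NODE_ADDRESS and writes it to wsrep_node_address (please verify this against the image README for your tag; it is an assumption here, not something confirmed in the question). It also assumes ansible_default_ipv4.address is the address the other hosts use to reach this one, which matches how the question builds the cluster address.

```yaml
# Sketch only: fragment of the "Deploy mysql cluster containers" task.
- name: Deploy mysql cluster containers
  docker_container:
    name: mariadb
    image: "bitnami/mariadb-galera:10.5"
    ports:
      - "0.0.0.0:3306:3306"
      - "0.0.0.0:4567:4567"
      - "0.0.0.0:4567:4567/udp"
      - "0.0.0.0:4568:4568"
      - "0.0.0.0:4444:4444"
    env:
      # Assumption: advertise the host IP instead of the container's bridge IP.
      MARIADB_GALERA_NODE_ADDRESS: "{{ ansible_default_ipv4.address }}"
      # ... remaining MARIADB_* variables unchanged from the question ...
```

The published ports stay as they were; only the address the node announces to its peers changes. Whether this alone is enough for SST and IST, or whether the receive addresses from the .cnf example above also need to be set, would have to be tested.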