
As far as I understand, Galera does not scale write performance; at best it does not degrade it compared to a single server, so we use Galera for HA. (Read performance may benefit, and reads can be load balanced across all nodes.)

In Galera the write performance will be that of the weakest node. In light of that, who needs more HA than 3-4 nodes can provide? If one or two nodes fail, the write performance will not degrade; it may even improve... so guarding HA with more than 4 nodes seems excessive. (I am aware of the WAN use case with multiple datacenters, but in that case we have two or more Galera clusters.)

It is not clear to me: given a particular write transaction, is there any significant load difference on the node that the client actually uses to write, compared to the load on all other nodes, which also need to complete the very same transaction synchronously?

I am trying to decide whether I should have a dedicated node to write to (which is, by the way, a single point of failure, so I would need some infrastructure to react to it) and use the others only for reads, or whether I should try to load balance the writes across all nodes as well.

How many nodes are optimal? Does it depend on my application's read/write load ratio?

g.pickardou

1 Answer


performance...more HA than 3-4 nodes can provide?

Huh? I don't see "performance" and "HA" as being equivalent.

If one or two nodes fail, the write performance will not degrade,

When 2 nodes fail, a 3-node system will stop allowing writes. A 5-node cluster can withstand 2 simultaneous failures. (For 4 nodes, it depends on the weights given to the nodes.)
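To put rough numbers on that (my own illustration, assuming the default weight of 1 per node): a surviving partition keeps quorum only while it holds more than half of the cluster's total weight. With 5 nodes, losing 2 leaves 3 of 5 votes (3 > 2.5), so writes continue; with 3 nodes, losing 2 leaves 1 of 3 (1 < 1.5), so the remaining node becomes non-primary and stops accepting writes.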

Also, all activity suffers some degradation while a node is coming back up, especially when SST is required.

have two or more Galera clusters

But how do you replicate between the Clusters? Async gives you some scaling, but not necessarily more HA.
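A minimal sketch of the usual asynchronous bridge (host names and the replication user are placeholders; this assumes GTID-based replication and classic MySQL syntax, which becomes CHANGE REPLICATION SOURCE TO on 8.0.22+). One node of cluster B is made an async replica of one node of cluster A; that node normally needs log_bin, log_slave_updates and gtid_mode=ON so the incoming changes are propagated to the rest of cluster B through Galera:

    -- Run on the designated replica node in cluster B (placeholders throughout).
    CHANGE MASTER TO
        MASTER_HOST = 'clusterA-node1',
        MASTER_USER = 'repl',
        MASTER_PASSWORD = '********',
        MASTER_AUTO_POSITION = 1;   -- position by GTID instead of file/offset
    START SLAVE;

The same mechanism serves the read-scaling note below: any ordinary async replica can be pointed at one of the cluster nodes.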

given a particular write transaction, is there any significant load difference on a node that the client actually uses to write, compared to the load on all other nodes,

It depends. On the originating node, there may be a lot of extra effort to, say, locate the rows to Update or Delete. The other nodes will be told exactly which rows to change, if any.
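An illustrative statement (my example; the table and column names are made up): with row-based write-sets, the receiving node does the row lookup, while the other nodes only apply the resulting row changes.

    -- The node that receives this statement must scan `orders` to find the
    -- matching rows (expensive if customer_note is not indexed). The write-set
    -- certified on the other nodes carries only the changed row images, which
    -- they apply by primary key without re-running the WHERE clause.
    UPDATE orders
    SET    status = 'archived'
    WHERE  customer_note LIKE '%obsolete%';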

should I have a dedicated node to write

Some people do that. With a Proxy in front of the nodes, you can 'fail over' writes to a different node rather fast. So, I don't call it a "single point of failure".
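For example, a ProxySQL 2.x style setup can keep exactly one node as the active writer and promote another automatically. A rough sketch against the ProxySQL admin interface (hostgroup numbers and host names are my placeholders; check the ProxySQL documentation for the exact semantics of mysql_galera_hostgroups):

    -- Register the Galera nodes in a writer hostgroup (10) and a reader hostgroup (20).
    INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES
        (10, 'node1', 3306), (10, 'node2', 3306), (10, 'node3', 3306),
        (20, 'node1', 3306), (20, 'node2', 3306), (20, 'node3', 3306);

    -- Keep at most one active writer; ProxySQL demotes the others to the
    -- backup writer hostgroup (12) and promotes one of them if the writer fails.
    INSERT INTO mysql_galera_hostgroups
        (writer_hostgroup, backup_writer_hostgroup, reader_hostgroup,
         offline_hostgroup, active, max_writers, writer_is_also_reader)
    VALUES (10, 12, 20, 30, 1, 1, 0);

    LOAD MYSQL SERVERS TO RUNTIME;
    SAVE MYSQL SERVERS TO DISK;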

Read scaling can be achieved with arbitrarily many read-only nodes asynchronously replicating off of various cluster nodes.

How many nodes are optimal?

Probably 3. More than 5 can be a problem due to all the network connections that need to exist.
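As a rough illustration (my arithmetic): every node talks to every other node, and every write-set has to be delivered to and certified on all of them, so a cluster maintains about n(n-1)/2 node-to-node links; 5 nodes means 10 links, 10 nodes means 45.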

Does it depend on my application's read/write load ratio?

Not much depends on that ratio.

If HA is the main goal, you want to use 3 different data centers to avoid natural disasters that could wipe out (or knock offline) a whole data center.

If scaling writes is a big problem, please describe the application. There are techniques that may help significantly.

Rick James