Quorum options - lesser known

Some lesser-known quorum options

December 1, 2024

Proxmox do not really cater for cluster deployments at the small scale of 2-4 nodes and their out-of-the-box configuration always assumes that High Availability could be put to use. It is very likely for this reason that some great features of Corosync configuration 1 are left out of the official documentation entirely.

Tip

You might want to read more on how Proxmox utilise Corosync in a separate post.

Quorum provider service

Proxmox need a quorum provider service votequorum 2 to prevent data corruption in situations where a cluster splits into two or more partitions and a member of one of them would go on modifying the same data unchecked by the members it can no longer see (those of a detached partition). This is signified by the always-populated corosync.conf section:

quorum {
  provider: corosync_votequorum
}

Other key: value pairs can be specified here. One notable value of importance is expected_votes, which in a standard PVE deployment is not explicit:

votequorum requires an expected_votes value to function, this can be provided in two ways. The number of expected votes will be automatically calculated when the nodelist { } section is present in corosync.conf or expected_votes can be specified in the quorum { } section.

The quorum value is then calculated as a majority out of the sum of the nodelist { node { quorum_votes: } } values. You can see the live calculated value on any node: 3

corosync-quorumtool 

---8<---

Votequorum information
----------------------
Expected votes:   4
Highest expected: 4
Total votes:      4
Quorum:           3  
Flags:            Quorate 

---8<---
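
The figures above correspond to the automatic calculation from the nodelist { } section. A minimal sketch of what such a section might look like (the node names, IDs and addresses are made up for illustration): four nodes with one vote each yield Expected votes of 4 and a Quorum (majority) of 3.

nodelist {
  node {
    name: node1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.0.0.1
  }
  node {
    name: node2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.0.0.2
  }
  node {
    name: node3
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.0.0.3
  }
  node {
    name: node4
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 10.0.0.4
  }
}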

Tip

The Proxmox-specific tooling 4 makes use of this output as well with pvecm status. It is also this value you are temporarily changing with pvecm expected, which actually makes use of corosync-quorumtool -e.
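
For instance, to temporarily lower the number of expected votes (the value 1 below is only an example), either the PVE wrapper or the Corosync tool itself can be used:

# temporarily set the expected votes to 1 (example value)
pvecm expected 1

# the underlying call it relies on
corosync-quorumtool -e 1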

The options

These can be added to the quorum {} section:

The two-node cluster

The option two_node: 1 is meant for clusters made up of 2 nodes. It causes each node to assume it is quorate ever after, once it has successfully booted up and has seen the other node at least once. This has quite some merit considering that a disappearing node can be assumed to have gone down and it is therefore safe to continue operating alone. If you run this simple cluster setup, your remaining node does not have to lose quorum when the other one is down.
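
A minimal sketch of such a quorum section follows. Note that, according to the votequorum manual page, enabling two_node also implicitly enables wait_for_all, so after a cold start both nodes have to be seen at least once before the cluster becomes quorate:

quorum {
  provider: corosync_votequorum
  two_node: 1
}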

Auto tie-breaker

The option auto_tie_breaker: 1 (ATB) allows two equally sized partitions to decide deterministically which one retains quorum. For example, a 4-node cluster split into two 2-node partitions would not allow either to become quorate, but ATB allows one of them to be picked as quorate, by default the one containing the lowest nodeid. This can be tweaked with the tunable auto_tie_breaker_node: lowest|highest|<list of node IDs>.

This could also be your go-to option in case you are running a 2-node cluster with one of the nodes in a “master” role and the other one almost invariably off.
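
A sketch of that scenario, assuming the “master” node happens to have nodeid 1 (the node ID here is just an illustrative assumption):

quorum {
  provider: corosync_votequorum
  auto_tie_breaker: 1
  auto_tie_breaker_node: 1
}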

Last man standing

The option last_man_standing: 1 (LMS) allows the cluster to dynamically adapt to scenarios where nodes go down for prolonged periods by recalculating the expected_votes value. In a 10-node cluster where e.g. 3 nodes have not been seen for longer than a specified period (by default 10 seconds, tunable via the last_man_standing_window option in milliseconds), the new expected_votes value becomes 7. This can cascade down to as few as 2 nodes left being quorate. If you also enable ATB, it can go down to even just a single node.
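
A sketch combining LMS with ATB and a longer window (the 30-second window below is an arbitrary example value):

quorum {
  provider: corosync_votequorum
  last_man_standing: 1
  last_man_standing_window: 30000
  auto_tie_breaker: 1
}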

Warning

This option should not be used in HA clusters as implemented by Proxmox.

Notes

If you are also looking for a way to safely disable HA on a Proxmox cluster, there is a guide.