Proxmox is a platform that allows you to manage virtual machines through a web interface. It offers many features, such as high availability with automatic failover, which is what I will be setting up in this post.
Requirements
A Proxmox cluster requires at least 3 machines for high availability to work. Additionally, storage replication requires ZFS.
I will make 3 VMs: vmpve1, vmpve2 and vmpve3, with IP addresses 172.16.0.211 through 172.16.0.213.
I will assume you installed Proxmox using ZFS as the storage format, since we will need ZFS for replication.
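The three-machine minimum comes down to quorum: the cluster only keeps operating while a strict majority of nodes can still see each other. Here is a minimal sketch of that arithmetic, assuming every node has exactly one vote (the function names are my own, for illustration only):

```python
def votes_needed(total_nodes):
    """Votes required for a strict majority when every node has one vote."""
    return total_nodes // 2 + 1


def tolerated_failures(total_nodes):
    """Number of nodes that can go offline while the rest still hold a majority."""
    return total_nodes - votes_needed(total_nodes)


for nodes in (2, 3, 5):
    print(f"{nodes} nodes: quorum={votes_needed(nodes)}, "
          f"can lose {tolerated_failures(nodes)}")
```

With 2 nodes, losing either one leaves the survivor without a majority, which is why 3 is the smallest useful cluster size for failover.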
A few notes on my test setup
Since I don't have 3 machines lying around to test with, I decided to set up nested virtualization on the single machine in my homelab. This allows me to play with redundancy features without actually needing 3 separate physical machines.
Nested virtualization is not recommended outside of testing, since performance will be less than ideal, but it is good enough for a test setup.
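If you want to check whether nested virtualization is switched on, the sketch below may help. It assumes the outer hypervisor is a Linux host using KVM, where the kvm_intel or kvm_amd module exposes a "nested" parameter under /sys; on other hypervisors this check does not apply. The helper names are hypothetical.

```python
from pathlib import Path

# Assumption: Linux host with the KVM module for Intel or AMD loaded.
NESTED_PARAM_FILES = (
    Path("/sys/module/kvm_intel/parameters/nested"),
    Path("/sys/module/kvm_amd/parameters/nested"),
)


def parse_nested_flag(text):
    """The parameter file contains 'Y' or '1' when nesting is enabled."""
    return text.strip() in ("Y", "1")


def nested_virtualization_enabled(candidates=NESTED_PARAM_FILES):
    """Return True if the first KVM parameter file found reports nesting as on."""
    for path in candidates:
        if path.exists():
            return parse_nested_flag(path.read_text())
    return False  # no KVM module found, so nothing to report


print("nested virtualization enabled:", nested_virtualization_enabled())
```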
Creating the cluster
Log in to the first Proxmox machine, click on "Datacenter" > "Cluster" and then on the "Create Cluster" button. Give the cluster a name (like "MyCluster") and click "Create".
Next, click on the "Join Information" button and copy the join information. We will need this to add the other nodes.
Next, log in to the second node, go to "Datacenter" > "Cluster" and click "Join Cluster". Paste in the join information, after which a new field will appear where you enter the root password of the first node.
Do the same for the third node.
If you go to any of the nodes, you will see that you can manage all of them from a single interface.
We now have a cluster.
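The same steps can also be done from a shell on the nodes with the pvecm tool. The sketch below only builds and prints the commands (dry run by default), since I did everything through the web interface. The cluster name and IP address are the ones used in this post, and the helper functions are my own, not part of Proxmox.

```python
import shlex
import subprocess

FIRST_NODE_IP = "172.16.0.211"  # vmpve1, from the address range used in this post


def create_cluster_command(cluster_name):
    """Command to run on the first node to create the cluster."""
    return ["pvecm", "create", cluster_name]


def join_cluster_command(existing_node_ip):
    """Command to run on each additional node; it asks for the root password."""
    return ["pvecm", "add", existing_node_ip]


def run(command, dry_run=True):
    """Print the command, and only execute it when dry_run is False."""
    print(shlex.join(command))
    if not dry_run:
        subprocess.run(command, check=True)


run(create_cluster_command("MyCluster"))  # on vmpve1
run(join_cluster_command(FIRST_NODE_IP))  # on vmpve2, then on vmpve3
run(["pvecm", "status"])                  # on any node, to check membership and quorum
```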
Setting up replication and high-availability
Before we can begin, you will need to make at least one VM. I will be installing a simple Debian VM with a webserver on it, but this can be anything you like.
Once you have a VM installed, go to the VM and select "Replication". We will create two jobs: one to the second node, and one to the third node.
This will mirror the VM disk to the other nodes, every 5 minutes. Click "Schedule now" on both jobs (or wait a few minutes) until the VM disk has replicated to both nodes.
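For reference, the two replication jobs could also be created with the pvesr command line tool. This is a sketch under a few assumptions: the VM has ID 100 and currently lives on vmpve1, and the schedule "*/5" matches the 5 minute interval mentioned above. Job IDs take the form VMID-JOBNUMBER. The code only prints the commands; the helper function is hypothetical.

```python
import shlex

VMID = 100  # assumption: replace with the ID of your own VM
TARGET_NODES = ["vmpve2", "vmpve3"]  # the VM itself is assumed to run on vmpve1


def replication_job_command(vmid, job_number, target_node, schedule="*/5"):
    """Build a 'pvesr create-local-job' command for one target node."""
    job_id = f"{vmid}-{job_number}"
    return ["pvesr", "create-local-job", job_id, target_node,
            "--schedule", schedule]


commands = [
    replication_job_command(VMID, number, node)
    for number, node in enumerate(TARGET_NODES)
]

for command in commands:
    # shlex.join quotes "*/5" so the shell does not expand the asterisk
    print(shlex.join(command))
```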
Next, we can click "More" > "Manage HA" to configure high availability on this VM.
Once this is done, we can test our cluster functionality.
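On the command line, adding a VM to high availability is done with ha-manager. A minimal sketch, again assuming the VM has ID 100 (HA resources for VMs are named "vm:" followed by the VM ID). The helper is my own and only prints the commands.

```python
import shlex

VMID = 100  # assumption: replace with the ID of your own VM


def ha_add_command(vmid, state="started"):
    """Build the command that registers a VM as an HA resource."""
    return ["ha-manager", "add", f"vm:{vmid}", "--state", state]


print(shlex.join(ha_add_command(VMID)))
print(shlex.join(["ha-manager", "status"]))  # shows the HA state of all resources
```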
Test 1: Live migration
If we select a VM and click the "Migrate" button, we can live migrate the VM to another node. I will migrate my VM from node 1 to node 2 in this example.
The VM will now run on node 2 without rebooting, although you might notice a very brief moment of packet loss (a few seconds).
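The migration can also be started from a shell on the node that currently runs the VM, using qm migrate with the online flag. As before this is only a sketch that prints the command; the VM ID 100 is an assumption and the helper name is hypothetical.

```python
import shlex

VMID = 100  # assumption: replace with the ID of your own VM


def live_migrate_command(vmid, target_node):
    """Build a 'qm migrate' command that moves a running VM to another node."""
    return ["qm", "migrate", str(vmid), target_node, "--online"]


print(shlex.join(live_migrate_command(VMID, "vmpve2")))
```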
Test 2: Failover
Next, we can test failover by unplugging the network on the node the VM currently runs on.
After a few minutes, you will notice that the VM is started on another node. In this case there will be a few minutes of downtime, and the VM will boot from the last replicated copy of its disk, so anything written after the last replication is lost. Still, the VM starts again automatically and downtime is minimized.
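Because the VM boots from the last replicated disk, the amount of lost data depends on how long ago the last replication finished. The sketch below works through that with made-up timestamps. It is a simplification that ignores how long a replication run itself takes.

```python
from datetime import datetime, timedelta


def data_loss_window(last_replication, failure_time):
    """Time span of writes that never reached the other nodes."""
    return failure_time - last_replication


def worst_case_window(schedule_minutes):
    """Upper bound: the node fails right before the next scheduled run."""
    return timedelta(minutes=schedule_minutes)


# Hypothetical timestamps, for illustration only
last_replication = datetime(2024, 1, 1, 12, 0, 0)
failure_time = datetime(2024, 1, 1, 12, 3, 30)

print("lost in this example:", data_loss_window(last_replication, failure_time))
print("worst case with a 5 minute schedule:", worst_case_window(5))
```

If losing up to 5 minutes of writes is too much for a workload, a shorter replication interval reduces the window at the cost of more frequent replication traffic.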