Rebuild Failover Cluster Replication
Purpose: In an environment with multiple Hyper-V Failover Clusters that replicate to one another through a Hyper-V Replica Broker role installed on a host within the Failover Cluster, a GuestVM will sometimes fail to replicate to the replica cluster and may not be able to recover on its own. This guide outlines the process for rebuilding replication for GuestVMs one at a time.
Assumptions
This guide assumes you have two Hyper-V Failover Clusters. For the sake of the guide, we will refer to the Production cluster as CLUSTER-01 and the Replication cluster as CLUSTER-02. This guide also assumes that replication was set up beforehand; it does not (at this time) include instructions on how to deploy a Replica Broker.
Production Cluster - CLUSTER-01
Locate the GuestVM
Start by locating the GuestVM in the Production cluster, CLUSTER-01. You will know you have found the right VM when its "Replication Health" is Unhealthy, Warning, or Critical.
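When several GuestVMs are involved, the check above can be written as a small filter. This is a minimal sketch in Python; the record layout (a list of dicts with "name" and "health" keys) is a hypothetical stand-in for however you export the replication list, not an actual Hyper-V output format.

```python
from typing import Dict, List

# Health values this guide treats as "replication needs a rebuild".
REBUILD_HEALTH = {"unhealthy", "warning", "critical"}


def needs_rebuild(health: str) -> bool:
    """Return True when a Replication Health value matches the guide's criteria."""
    return health.strip().lower() in REBUILD_HEALTH


def vms_to_rebuild(records: List[Dict[str, str]]) -> List[str]:
    """Return the names of GuestVMs whose health calls for a rebuild.

    `records` is a hypothetical export: [{"name": ..., "health": ...}, ...].
    """
    return [r["name"] for r in records if needs_rebuild(r["health"])]


if __name__ == "__main__":
    # Illustrative data only; the VM names are made up.
    sample = [
        {"name": "GuestVM-A", "health": "Normal"},
        {"name": "GuestVM-B", "health": "Critical"},
        {"name": "GuestVM-C", "health": "Warning"},
    ]
    print(vms_to_rebuild(sample))
```

The comparison is case-insensitive so that values copied from the console or from a script export are treated the same way.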
Remove Replication from GuestVM
- Within a node of the Hyper-V: Failover Cluster Manager
- Right-Click the GuestVM
- Navigate to "Replication > Remove Replication"
- Confirm the removal by clicking the "Yes" button. Replication has been removed once the "Replication State" of the GuestVM shows Not enabled
Replication Cluster - CLUSTER-02
Note the storage GUID of the GuestVM in the replication cluster
- Within a node of the replication cluster's Hyper-V: Failover Cluster Manager
- Right-Click the same GuestVM and click "Manage..."
This will open Hyper-V Manager
- Right-Click the GuestVM and click "Settings..."
- Navigate to "SCSI Controller"
- Click on one of the Virtual Disks attached to the replica VM, and note the full folder path for later. e.g.
C:\ClusterStorage\Volume1\HYPER-V REPLICA\VIRTUAL HARD DISKS\020C9A30-EB02-41F3-8D8B-3561C4521182
Noting the GUID of the GuestVM
You need to note the folder location because it contains the GUID. Without the GUID, cleaning up the old storage associated with the GuestVM replica files will be much more difficult and time-consuming. Note it down somewhere safe; you will need it later in this guide.
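If you copy the full path of a virtual disk rather than just the folder, the GUID can be pulled out of it programmatically. The following is a minimal Python sketch, assuming the storage folder is named with a standard 8-4-4-4-12 hexadecimal GUID as in the example path above; the function name is illustrative.

```python
import re
from pathlib import PureWindowsPath
from typing import Optional

# Standard GUID layout: 8-4-4-4-12 hexadecimal characters.
GUID_RE = re.compile(
    r"[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}"
)


def extract_guid(path: str) -> Optional[str]:
    """Return the deepest path component that is exactly a GUID, or None."""
    # PureWindowsPath parses Windows paths on any operating system.
    for part in reversed(PureWindowsPath(path).parts):
        if GUID_RE.fullmatch(part):
            return part
    return None


if __name__ == "__main__":
    folder = (
        r"C:\ClusterStorage\Volume1\HYPER-V REPLICA\VIRTUAL HARD DISKS"
        r"\020C9A30-EB02-41F3-8D8B-3561C4521182"
    )
    print(extract_guid(folder))
```

Searching from the end of the path means the function also works when the path includes a disk file name beneath the GUID folder.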
Delete the GuestVM from the Replication Cluster
Now that you have noted the GUID of the GuestVM's storage folder, we can safely move on to removing the GuestVM from the replication cluster.
- Within a node of the replication cluster's Hyper-V: Failover Cluster Manager
- Right-Click the GuestVM
- Navigate to "Replication > Remove Replication"
- Confirm the removal by clicking the "Yes" button. Replication has been removed once the "Replication State" of the GuestVM shows Not enabled
- Right-Click the GuestVM (again)
You will see that "Enable Replication" is an option now, indicating it was successfully removed.
Replica Checkpoint Merges
Removing replication may leave behind replication checkpoints that automatically begin merging, shown with a Merge in Progress status. Let the merge finish before moving forward.
- Within the same node of the replication cluster's Hyper-V: Failover Cluster Manager
Switch back from Hyper-V Manager
- Right-Click the GuestVM and click "Remove"
- Confirm the action by clicking the "Yes" button
Delete the GuestVM manually from Hyper-V Manager on all replication cluster hosts
At this point, we need to remove the GuestVM from all of the servers in the cluster. Removing it from the Hyper-V: Failover Cluster did not remove it from the cluster's nodes. We can simplify this work by opening Hyper-V Manager on the same Failover Node we have been working on and connecting the rest of the replication nodes to it. This gives us one place to manage every node and avoids hopping between servers.
- Open Hyper-V Manager
- Right-Click "Hyper-V Manager" on the left-hand navigation menu
- Click "Connect to Server..."
- Type the name of a node in the replication cluster to connect to it, repeating the two steps above for every node
- Remove the GuestVM from the node it appears on
- One of the replication cluster nodes will list the GuestVM. Right-Click the GuestVM on that node and select "Delete"
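If you prefer scripting to clicking through each node, the per-node removal above can be approximated with the Hyper-V PowerShell cmdlet Remove-VM. The sketch below only builds the command strings in Python so they can be reviewed before anything is run; the node names are hypothetical, and it assumes the Hyper-V PowerShell module is available wherever the commands are eventually executed.

```python
from typing import List


def ps_quote(value: str) -> str:
    """Wrap a value in PowerShell single quotes, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def remove_vm_command(vm_name: str, node: str) -> str:
    """Build a Remove-VM command targeting one replication cluster node."""
    return (
        f"Remove-VM -Name {ps_quote(vm_name)} "
        f"-ComputerName {ps_quote(node)} -Force"
    )


def remove_vm_commands(vm_name: str, nodes: List[str]) -> List[str]:
    """Build one command per node; review them before running any."""
    return [remove_vm_command(vm_name, node) for node in nodes]


if __name__ == "__main__":
    # Hypothetical node names for the replication cluster.
    for command in remove_vm_commands("GuestVM", ["CLUSTER-02-N1", "CLUSTER-02-N2"]):
        print(command)
```

Only the node that actually lists the GuestVM needs the command; running it against a node that does not host the VM should simply report that the VM was not found. Remove-VM deletes the VM configuration but leaves the virtual hard disks in place, which is why the storage cleanup in the next section is still required.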
Delete the GuestVM's replicated VHDX storage from replication ClusterStorage
Now we need to clean up the storage left behind by the replication cluster.
- Within a node of the replication cluster
- Navigate to C:\ClusterStorage\Volume1\HYPER-V REPLICA\VIRTUAL HARD DISKS
- Delete the entire GUID folder noted in the previous steps, e.g. 020C9A30-EB02-41F3-8D8B-3561C4521182
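Because deleting the wrong folder here would destroy another VM's replica disks, it can help to sanity-check the target first. The following Python sketch performs no deletion; it only confirms that a candidate folder sits directly under the replica storage root used in this guide and that its name matches the GUID noted earlier. The function name is illustrative.

```python
import re
from pathlib import PureWindowsPath

# Replica storage root used in this guide's example environment.
REPLICA_ROOT = PureWindowsPath(
    r"C:\ClusterStorage\Volume1\HYPER-V REPLICA\VIRTUAL HARD DISKS"
)

GUID_RE = re.compile(
    r"[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}"
)


def is_safe_delete_target(folder: str, noted_guid: str) -> bool:
    """Return True only for <REPLICA_ROOT>\\<noted_guid>; nothing is deleted."""
    candidate = PureWindowsPath(folder)
    return (
        candidate.parent == REPLICA_ROOT  # directly under the replica root
        and GUID_RE.fullmatch(candidate.name) is not None  # named like a GUID
        and candidate.name.lower() == noted_guid.lower()  # the GUID we noted
    )


if __name__ == "__main__":
    guid = "020C9A30-EB02-41F3-8D8B-3561C4521182"
    print(is_safe_delete_target(str(REPLICA_ROOT / guid), guid))
```

The check rejects the storage root itself and any GUID folder other than the one recorded for this GuestVM.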
Production Cluster - CLUSTER-01
Re-Enable Replication on GuestVM in CLUSTER-01 (Production Cluster)
At this point, we have disabled replication for the GuestVM and cleaned up traces of it in the replication cluster. Now we need to re-enable replication on the GuestVM back in the production cluster.
- Within a node of the production Hyper-V: Failover Cluster Manager
- Right-Click the GuestVM
- Navigate to "Replication > Enable Replication..."
- Click "Next"
- For the "Replica Server", enter the name of the Hyper-V Replica Broker role in the replication cluster's Failover Cluster (e.g. CLUSTER-02-REPL), then click "Next"
- Click the "Select Certificate" button, since the Broker was configured with certificate-based authentication instead of Kerberos (in this example environment). It will prompt you to accept the certificate (e.g. HV Replica Root CA) by clicking "OK". Then click "Next"
- Make sure every drive you want replicated is checked, then click "Next"
- Replication Frequency: 5 Minutes, then click "Next"
- Additional Recovery Points: Maintain only the latest recovery point, then click "Next"
- Initial Replication Method: Send initial copy over the network
- Schedule Initial Replication: Start replication immediately
- Click "Next"
- Click "Finish"
Replication Enabled
If everything was successful, you will see a dialog box named "Enable replication for <GuestVM>" with a message similar to the following: "Replica virtual machine <GuestVM> was successfully created on the specified Replica server <Node-in-Replication-Cluster>."
At this point, you can click "Close" to finish the process. Under the GuestVM details, you will see "Replication State": Initial Replication in Progress.
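For reference, the wizard choices in this section map roughly onto the Hyper-V PowerShell cmdlets Enable-VMReplication and Start-VMInitialReplication. The Python sketch below only assembles the command strings for review. Several values are assumptions rather than facts from this guide: port 443 is a common choice for certificate-based replication but may differ in your environment, and the certificate thumbprint is a placeholder you must supply yourself.

```python
from typing import List


def ps_quote(value: str) -> str:
    """Wrap a value in PowerShell single quotes, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def enable_replication_commands(
    vm_name: str,
    broker: str,
    thumbprint: str,
    port: int = 443,  # assumption: adjust to the port your Broker listens on
) -> List[str]:
    """Build commands mirroring the wizard settings used in this guide."""
    enable = (
        f"Enable-VMReplication -VMName {ps_quote(vm_name)} "
        f"-ReplicaServerName {ps_quote(broker)} "
        f"-ReplicaServerPort {port} "
        "-AuthenticationType Certificate "
        f"-CertificateThumbprint {ps_quote(thumbprint)} "
        "-ReplicationFrequencySec 300 "  # Replication Frequency: 5 Minutes
        "-RecoveryHistory 0"  # keep only the latest recovery point
    )
    # Start the initial copy over the network right away.
    start = f"Start-VMInitialReplication -VMName {ps_quote(vm_name)}"
    return [enable, start]


if __name__ == "__main__":
    # "<thumbprint>" is a placeholder, not a real certificate thumbprint.
    for command in enable_replication_commands(
        "GuestVM", "CLUSTER-02-REPL", "<thumbprint>"
    ):
        print(command)
```

Keeping the two commands separate mirrors the wizard, where enabling replication and scheduling the initial copy are distinct steps. Verify each parameter against the cmdlet help on your own hosts before running anything.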