In Windows Server 2016 Microsoft finally introduced workgroup clusters: a mode in which the cluster nodes do not need to be domain members. Workgroup clusters can be useful for small companies that want to make some of their resources highly available but don’t want to deploy a domain controller just to create a domain for the cluster nodes. In this post I’ll show how I deployed my workgroup cluster and how it worked in my test lab.
I’m going to create the cluster using two clustered disks: Cluster (the quorum witness disk) and GuestCluster1 (the cluster data disk; Guest refers to the guest cluster, as my cluster nodes are the virtual machines Node1 and Node2). As you probably know, there are many options for configuring cluster shared storage (for example iSCSI or an FC SAN). In my tests I prefer iSCSI for physical clusters and a shared VHD file for guest clusters, so my cluster shared storage consists of two shared virtual disks. Before we can start creating the cluster, these disks must be brought online, initialized and formatted on both nodes:
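The same preparation could also be scripted. The following is only a minimal lab sketch: it assumes that the only offline or uninitialized (RAW) disks on the node are the two shared ones, and it lets Windows pick the drive letters, so check the output of Get-Disk before running anything like it.

```powershell
# Bring every offline disk online and clear the read-only flag.
# Assumption: only the two shared disks are offline on this node.
Get-Disk | Where-Object { $_.IsOffline } | Set-Disk -IsOffline $false
Get-Disk | Where-Object { $_.IsReadOnly } | Set-Disk -IsReadOnly $false

# Initialize disks that have no partition table yet,
# then create one NTFS volume on each of them.
Get-Disk | Where-Object { $_.PartitionStyle -eq 'RAW' } |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -AssignDriveLetter -UseMaximumSize |
    Format-Volume -FileSystem NTFS -Confirm:$false
```

Once the first node has initialized and formatted the disks, the second pipeline should find nothing left to do on the other node, where the disks only need to be brought online.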
As I’m going to create a workgroup cluster, I need to take several additional steps: since there will be no Active Directory or DNS servers, I should define the primary DNS suffix and add the FQDNs of the nodes and the cluster resources to the hosts file on both nodes:
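Both steps can be sketched in PowerShell as well. This assumes Cloud.com as the suffix (the domain name that shows up later in the wizard) and sets it through the NV Domain registry value, which takes effect after a reboot. Everything in angle brackets is a placeholder, not a value from my lab, and must be replaced before running.

```powershell
# Assumption: Cloud.com is the primary DNS suffix for both nodes.
$suffix = 'Cloud.com'
Set-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters' `
    -Name 'NV Domain' -Value $suffix
# Reboot the node afterwards so the new suffix is applied.

# Placeholders: replace every <...> with the real address or name.
$hostsFile = "$env:SystemRoot\System32\drivers\etc\hosts"
Add-Content -Path $hostsFile -Value @(
    "<Node1 IP>`tNode1.$suffix"
    "<Node2 IP>`tNode2.$suffix"
    "<cluster IP>`t<cluster name>.$suffix"
    "<ClusterFS IP>`tClusterFS.$suffix"
)
```

Run it in an elevated session on each node, since both the registry key and the hosts file need administrative rights.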
The next step is to make sure the cluster nodes can authenticate each other using NTLM. For this I set the same password for the built-in Administrator account on Node1 and Node2 (you can use any other local administrative account with the same user name and password on both nodes, but that needs additional configuration).
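As far as I know, the additional configuration for a local administrator account other than the built-in one is the LocalAccountTokenFilterPolicy registry value, which stops UAC from filtering the account’s remote token. A minimal sketch, to be run on every node:

```powershell
# Only needed when the cluster is created with a local administrator
# account other than the built-in Administrator.
New-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System' `
    -Name 'LocalAccountTokenFilterPolicy' -Value 1 -PropertyType DWord -Force
```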
After creating the cluster I’m going to add the File Server cluster role with the name ClusterFS and the IP address 22.214.171.124 (the IP address 126.96.36.199 will be assigned to the cluster itself).
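Once the cluster exists, the role can be added from PowerShell as well as from Failover Cluster Manager. In this sketch the storage resource name Cluster Disk 2 is an assumption (use whatever name Get-ClusterResource reports for the data disk); the role name and address are the ones given above.

```powershell
# List the cluster resources to find the real name of the data disk.
Get-ClusterResource | Format-Table Name, ResourceType, OwnerGroup, State

# Assumption: the data disk resource is called 'Cluster Disk 2'.
Add-ClusterFileServerRole -Name 'ClusterFS' `
    -Storage 'Cluster Disk 2' `
    -StaticAddress '22.214.171.124'
```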
Now we can proceed to creating the cluster. First I add the File Server role and the Failover Clustering feature to Node1 (without the File Server role it won’t be possible to add a File Server cluster role in Failover Cluster Manager):
The same role and feature should be installed on Node2.
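The PowerShell equivalent of the Add Roles and Features wizard is a one-liner, shown here as a sketch to run locally on each node:

```powershell
# Install the File Server role service and the Failover Clustering feature,
# together with the management tools (Failover Cluster Manager, PowerShell module).
Install-WindowsFeature -Name FS-FileServer, Failover-Clustering -IncludeManagementTools
```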
And here’s the first gotcha: I ran this wizard many times and was unable to interpret this error correctly. Sometimes it disappeared after 10-15 minutes (I just clicked Next again), sometimes deleting the domain name (Cloud.com) helped me move on to the next window, and sometimes I managed to go further by using IP addresses instead of host names… It seems that the main cure was neither IP addresses nor short names but the time elapsed between opening the Select Servers window and pressing the Next button: the more time elapsed, the better the chances it would work.
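A workgroup cluster can also be created from PowerShell instead of the wizard. The sketch below uses the New-Cluster parameter that workgroup clusters need, -AdministrativeAccessPoint DNS, so that no Active Directory computer object is involved. The cluster name is a placeholder and the address is the one mentioned earlier; whether this route sidesteps the wizard error described above is not something the sketch can promise.

```powershell
# Optional: run the validation tests against both nodes first.
Test-Cluster -Node Node1, Node2

# Placeholder: replace <cluster name> with the name added to the hosts file.
New-Cluster -Name '<cluster name>' `
    -Node Node1, Node2 `
    -AdministrativeAccessPoint DNS `
    -StaticAddress '126.96.36.199'
```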
As you can see, the wizard has automatically chosen the smallest disk (Y:, 2GB) as the disk witness for the quorum, so I don’t need to configure the quorum settings manually. Should you want to do it for some reason, right-click the cluster name and choose More Actions\Configure Cluster Quorum Settings…
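The quorum configuration can be checked, and changed if needed, from PowerShell too. The resource name Cluster Disk 1 below is an assumption; take the real name from Get-ClusterResource.

```powershell
# Show the current quorum type and witness resource.
Get-ClusterQuorum

# Assumption: the 2GB witness disk is the resource named 'Cluster Disk 1'.
Set-ClusterQuorum -DiskWitness 'Cluster Disk 1'
```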
And because I’m using public network addresses for both networks (which is not a best practice, as cluster networks should be private networks), I’ll configure the network metrics manually:
Get-ClusterNetwork | Format-Table Name, Metric, AutoMetric, Role
(Get-ClusterNetwork "Cluster Communications Network").Metric = 500
(Get-ClusterNetwork "Clients Network").Metric = 900
The goal is to force the cluster to use the Cluster Communications Network (188.8.131.52) for cluster communications (the network with the lowest metric is the preferred network for cluster communications).
On the Microsoft TechNet forum I was told that in this case the share should be created manually on the cluster disk, but it’s still unclear to me whether this is by design (but not documented!) or just a workaround:
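Creating the share manually can be sketched with New-SmbShare, scoping it to the ClusterFS name so that it belongs to the clustered file server rather than to the node. The drive letter and folder are placeholders, and the command should run on the node that currently owns the ClusterFS role.

```powershell
# Placeholder path: use the drive letter of the cluster data disk.
$sharePath = '<data disk letter>:\Shares\Test1'
New-Item -ItemType Directory -Path $sharePath -Force | Out-Null

# -ScopeName ties the share to the clustered file server name.
# With no access parameters the share grants Everyone read access;
# add -FullAccess or -ChangeAccess as needed.
New-SmbShare -Name 'Test1' -Path $sharePath -ScopeName 'ClusterFS'
```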
After I shut down Node2, the ClusterFS role failed over to Node1 and the Test1 file share remained accessible, so this file share really is highly available.
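A gentler way to repeat this test without powering off a node is to move the role by hand and check the share afterwards. A minimal sketch, assuming the role and share names used above:

```powershell
# See which node currently owns the role.
Get-ClusterGroup -Name 'ClusterFS' | Format-Table Name, OwnerNode, State

# Move the role to Node1 and confirm the share still answers.
Move-ClusterGroup -Name 'ClusterFS' -Node Node1
Test-Path '\\ClusterFS\Test1'
```

This is a planned move rather than a node failure, so it complements the shutdown test and does not replace it.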
You can make your applications/resources highly available using workgroup clusters, but the process may differ from that of a regular (domain-joined) cluster. In the next part I’ll test authoritative and non-authoritative restore of the workgroup cluster.