Deploy RKE2 Cluster
Deploying a Rancher RKE2 cluster is fairly straightforward. Run the commands in order, and pay attention to which steps apply to every machine in the cluster, which apply only to the controlplanes, and which apply only to the workers.
Prerequisites
This document assumes you are running Ubuntu Server 20.04 or later.
All Cluster Nodes
Assume all commands are run as root from here on (e.g. via sudo su).
Run Updates
Run these commands on every server that participates in the cluster. The last command reboots the server; let the reboot finish BEFORE moving on to the next section.
sudo apt update && sudo apt upgrade -y
sudo apt install nfs-common iptables nano htop -y
echo "Adding 15 Second Delay to Ensure Previous Commands finish running"
sleep 15
sudo apt autoremove -y
sudo reboot
Tip
If this is a virtual machine, now is the best time to take a checkpoint / snapshot of the VM, so you can roll the server(s) back if you accidentally misconfigure something later.
Initial ControlPlane Node
When you are starting a brand new cluster, you need to create what is referred to as the "Initial ControlPlane". This node bootstraps the cluster, and will later also help run container workloads and orchestrate operations in the cluster.
Warning
You only want to follow the instructions for the initial controlplane once. Running them on another machine to create additional controlplanes will bootstrap a second, separate cluster instead of joining the first, wreaking havoc. Instead, follow the instructions in the next section to add redundant controlplanes.
Download and Run the Server Deployment Script
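The RKE2 project publishes an installer script at https://get.rke2.io. A minimal sketch of this step, assuming that installer and its `INSTALL_RKE2_TYPE` variable are what this guide intends, wraps the download in a small helper (the `install_rke2` name is hypothetical):

```shell
# Hypothetical helper: download and run the upstream RKE2 installer.
# Assumes the installer at https://get.rke2.io and its INSTALL_RKE2_TYPE
# variable ("server" for controlplanes, "agent" for workers).
install_rke2() {
  local node_type="${1:-server}"
  case "$node_type" in
    server|agent) ;;
    *) echo "usage: install_rke2 [server|agent]" >&2; return 2 ;;
  esac
  # Fetch the installer quietly, fail on HTTP errors, follow redirects.
  curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="$node_type" sh -
}
```

On the initial controlplane this would be run as `install_rke2 server`, as root.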
Enable & Configure Services
# Become root
sudo su
# Enable and Start the RKE2 Server Service
systemctl enable rke2-server.service
systemctl start rke2-server.service
# Symlink the kubectl Management Command (first match only, in case several versions are present)
ln -sf "$(find /var/lib/rancher/rke2/data/ -name kubectl | head -n 1)" /usr/local/bin/kubectl
# Temporarily Export the Kubeconfig to Manage the Cluster from the CLI
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
# Add a Delay to Allow the Cluster to Finish Initializing
echo "Waiting 60 seconds for the cluster to initialize. If the node is still not Ready afterwards, re-run (kubectl get node) until it is."
sleep 60
# Check that the Cluster Node is Running and Ready
kubectl get node
Example
When the cluster is ready, running kubectl get node lists the node with a STATUS of Ready.
This may be a good point to step away for 5 minutes, get a cup of coffee, and come back so it has a little extra time to be fully ready before moving on.
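A fixed 60-second sleep may be too short or too long depending on the hardware. As an alternative sketch, the loop below polls with `kubectl wait` until every node reports the `Ready` condition or a deadline passes. The `wait_for_nodes` name and the 300-second default are illustrative choices, not part of RKE2.

```shell
# Hypothetical helper: block until all nodes are Ready, or give up.
wait_for_nodes() {
  local timeout_seconds="${1:-300}"
  local deadline=$(( $(date +%s) + timeout_seconds ))
  # kubectl wait fails right away while the API server is still starting,
  # so retry it in short bursts until the deadline passes.
  until kubectl wait --for=condition=Ready node --all --timeout=10s >/dev/null 2>&1; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "Nodes not Ready after ${timeout_seconds}s" >&2
      return 1
    fi
    sleep 5
  done
  # Show the final node state once everything is Ready.
  kubectl get node
}
```

With `KUBECONFIG` exported as above, `wait_for_nodes 600` would wait up to ten minutes before reporting a failure.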
Install Helm, Cert-Manager, Rancher, and Longhorn
# Install Helm
curl -#L https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
# Install Necessary Helm Repositories
helm repo add rancher-latest https://releases.rancher.com/server-charts/latest
helm repo add jetstack https://charts.jetstack.io
helm repo add longhorn https://charts.longhorn.io
helm repo update
# Install the Cert-Manager CRDs (keep this version in step with the chart version installed below)
kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.6.1/cert-manager.crds.yaml
# Install Cert-Manager via Helm (from the Jetstack repository)
helm upgrade -i cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace
# Install Rancher via Helm
helm upgrade -i rancher rancher-latest/rancher --create-namespace --namespace cattle-system --set hostname=rancher.bunny-lab.io --set bootstrapPassword=bootStrapAllTheThings --set replicas=1
# Install Longhorn via Helm
helm upgrade -i longhorn longhorn/longhorn --namespace longhorn-system --create-namespace
Be Patient - Come Back in 20 Minutes
Rancher takes a while to fully set itself up, and things will appear broken in the meantime. How long depends on the resources you gave the cluster; as a ballpark, give it at least 20 minutes to deploy before attempting to log into the webUI at https://rancher.bunny-lab.io.
To keep an eye on the deployment progress, run the following command: KUBECONFIG=/etc/rancher/rke2/rke2.yaml kubectl get pods --all-namespaces
The output should look similar to the following:
NAMESPACE NAME READY STATUS RESTARTS AGE
cattle-fleet-system fleet-controller-59cdb866d7-94r2q 1/1 Running 0 4m31s
cattle-fleet-system gitjob-f497866f8-t726l 1/1 Running 0 4m31s
cattle-provisioning-capi-system capi-controller-manager-6f87d6bd74-xx22v 1/1 Running 0 55s
cattle-system helm-operation-28dcp 0/2 Completed 0 109s
cattle-system helm-operation-f9qww 0/2 Completed 0 4m39s
cattle-system helm-operation-ft8gq 0/2 Completed 0 26s
cattle-system helm-operation-m27tq 0/2 Completed 0 61s
cattle-system helm-operation-qrgj8 0/2 Completed 0 5m11s
cattle-system rancher-64db9f48c-qm6v4 1/1 Running 3 (8m8s ago) 13m
cattle-system rancher-webhook-65f5455d9c-tzbv4 1/1 Running 0 98s
cert-manager cert-manager-55cf8685cb-86l4n 1/1 Running 0 14m
cert-manager cert-manager-cainjector-fbd548cb8-9fgv4 1/1 Running 0 14m
cert-manager cert-manager-webhook-655b4d58fb-s2cjh 1/1 Running 0 14m
kube-system cloud-controller-manager-awx 1/1 Running 5 (3m37s ago) 19m
kube-system etcd-awx 1/1 Running 0 19m
kube-system helm-install-rke2-canal-q9vm6 0/1 Completed 0 19m
kube-system helm-install-rke2-coredns-q8w57 0/1 Completed 0 19m
kube-system helm-install-rke2-ingress-nginx-54vgk 0/1 Completed 0 19m
kube-system helm-install-rke2-metrics-server-87zhw 0/1 Completed 0 19m
kube-system helm-install-rke2-snapshot-controller-crd-q6bh6 0/1 Completed 0 19m
kube-system helm-install-rke2-snapshot-controller-tjk5f 0/1 Completed 0 19m
kube-system helm-install-rke2-snapshot-validation-webhook-r9pcn 0/1 Completed 0 19m
kube-system kube-apiserver-awx 1/1 Running 0 19m
kube-system kube-controller-manager-awx 1/1 Running 5 (3m37s ago) 19m
kube-system kube-proxy-awx 1/1 Running 0 19m
kube-system kube-scheduler-awx 1/1 Running 5 (3m35s ago) 19m
kube-system rke2-canal-gm45f 2/2 Running 0 19m
kube-system rke2-coredns-rke2-coredns-565dfc7d75-qp64p 1/1 Running 0 19m
kube-system rke2-coredns-rke2-coredns-autoscaler-6c48c95bf9-fclz5 1/1 Running 0 19m
kube-system rke2-ingress-nginx-controller-lhjwq 1/1 Running 0 17m
kube-system rke2-metrics-server-c9c78bd66-fnvx8 1/1 Running 0 18m
kube-system rke2-snapshot-controller-6f7bbb497d-dw6v4 1/1 Running 4 (6m17s ago) 18m
kube-system rke2-snapshot-validation-webhook-65b5675d5c-tdfcf 1/1 Running 0 18m
longhorn-system csi-attacher-785fd6545b-6jfss 1/1 Running 1 (6m17s ago) 9m39s
longhorn-system csi-attacher-785fd6545b-k7jdh 1/1 Running 0 9m39s
longhorn-system csi-attacher-785fd6545b-rr6k4 1/1 Running 0 9m39s
longhorn-system csi-provisioner-8658f9bd9c-58dc8 1/1 Running 0 9m38s
longhorn-system csi-provisioner-8658f9bd9c-g8cv2 1/1 Running 0 9m38s
longhorn-system csi-provisioner-8658f9bd9c-mbwh2 1/1 Running 0 9m38s
longhorn-system csi-resizer-68c4c75bf5-d5vdd 1/1 Running 0 9m36s
longhorn-system csi-resizer-68c4c75bf5-r96lf 1/1 Running 0 9m36s
longhorn-system csi-resizer-68c4c75bf5-tnggs 1/1 Running 0 9m36s
longhorn-system csi-snapshotter-7c466dd68f-5szxn 1/1 Running 0 9m30s
longhorn-system csi-snapshotter-7c466dd68f-w96lw 1/1 Running 0 9m30s
longhorn-system csi-snapshotter-7c466dd68f-xt42z 1/1 Running 0 9m30s
longhorn-system engine-image-ei-68f17757-jn986 1/1 Running 0 10m
longhorn-system instance-manager-fab02be089480f35c7b2288110eb9441 1/1 Running 0 10m
longhorn-system longhorn-csi-plugin-5j77p 3/3 Running 0 9m30s
longhorn-system longhorn-driver-deployer-75fff9c757-dps2j 1/1 Running 0 13m
longhorn-system longhorn-manager-2vfr4 1/1 Running 4 (10m ago) 13m
longhorn-system longhorn-ui-7dc586665c-hzt6k 1/1 Running 0 13m
longhorn-system longhorn-ui-7dc586665c-lssfj 1/1 Running 0 13m
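To avoid scanning the listing by eye, a small filter can count the pods that are neither `Running` nor `Completed`. This is only a sketch: `count_unsettled_pods` is a hypothetical helper that reads `kubectl get pods --all-namespaces --no-headers` output on standard input and assumes STATUS is the fourth column, as in the listing above.

```shell
# Hypothetical helper: count pods whose STATUS is not Running or Completed.
# Input: lines from `kubectl get pods --all-namespaces --no-headers`.
count_unsettled_pods() {
  awk '$4 != "Running" && $4 != "Completed" { n++ } END { print n + 0 }'
}

# Usage on the controlplane (not executed here):
#   KUBECONFIG=/etc/rancher/rke2/rke2.yaml \
#     kubectl get pods --all-namespaces --no-headers | count_unsettled_pods
```

A result of `0` suggests the deployment has settled. A pod can be `Running` without all of its containers being ready, so the READY column is still worth a glance.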
Note
Be sure to write down the "bootstrapPassword" value for when you log into Rancher later. In this example, the password is bootStrapAllTheThings. Also be sure to adjust the "hostname" value to reflect the FQDN of the cluster; the example uses rancher.bunny-lab.io. You can leave it at this default and change it upon first login if you want. The hostname matters for the last step, where you adjust DNS.
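Since both values change per environment, it can help to pass them as parameters rather than editing the command inline. A sketch, where `install_rancher` is a hypothetical wrapper around the same `helm upgrade -i` command used earlier:

```shell
# Hypothetical wrapper: install Rancher with a caller-supplied FQDN and
# bootstrap password instead of the hard-coded example values.
install_rancher() {
  local hostname="${1:-}" bootstrap_password="${2:-}"
  if [ -z "$hostname" ] || [ -z "$bootstrap_password" ]; then
    echo "usage: install_rancher <fqdn> <bootstrap-password>" >&2
    return 2
  fi
  helm upgrade -i rancher rancher-latest/rancher \
    --namespace cattle-system --create-namespace \
    --set hostname="$hostname" \
    --set bootstrapPassword="$bootstrap_password" \
    --set replicas=1
}
```

For the values in this guide, the call would be `install_rancher rancher.bunny-lab.io bootStrapAllTheThings`.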
Log into webUI
At this point, you can log into the webUI at https://rancher.bunny-lab.io using the default bootStrapAllTheThings password, or whatever password you configured. You can change the password after logging in by navigating to Home > Users & Authentication > "..." > Edit Config > "New Password" > Save. From here, you can deploy more nodes, or deploy single-node workloads such as an Ansible AWX Operator.
Rebooting the ControlPlane Node
If you ever need to reboot the controlplane node and then run kubectl CLI commands, you will need to re-export the KUBECONFIG variable (as in the initial setup) after every reboot, because that export does not persist. Reboots should take much less time to get the cluster ready again than the original deployment.
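Based on the export used during the initial setup, the command is presumably the following. Appending the same line to root's `~/.bashrc` is one common way to avoid repeating it, though that is a suggestion rather than part of this guide.

```shell
# Point kubectl at the kubeconfig that RKE2 writes on server nodes.
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
```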
Create Additional ControlPlane Node(s)
Here you can add more controlplane nodes to give the RKE2 cluster redundancy. This is important for high-availability environments.
Download the Server Deployment Script
Configure and Connect to Initial ControlPlane Node
# Manually Create a Rancher-Kubernetes-Specific Config File
mkdir -p /etc/rancher/rke2/
# Inject IP of Initial ControlPlane Node into Config File
echo "server: https://192.168.3.21:9345" > /etc/rancher/rke2/config.yaml
# Inject the Initial ControlPlane Node trust token into the config file
# You can get the token by running the following command on the first node in the cluster: `cat /var/lib/rancher/rke2/server/node-token`
echo "token: K10aa0632863da4ae4e2ccede0ca6a179f510a0eee0d6d6eb53dca96050048f055e::server:3b130ceebfbb7ed851cd990fe55e6f3a" >> /etc/rancher/rke2/config.yaml
# Enable and Start the RKE2 Server Service
systemctl enable rke2-server.service
systemctl start rke2-server.service
# Symlink the kubectl Management Command (done last, so the binary exists once the service has started)
ln -sf "$(find /var/lib/rancher/rke2/data/ -name kubectl | head -n 1)" /usr/local/bin/kubectl
Note
Be sure to change the IP address and the token of the initial controlplane node in the example above to match your environment.
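Because the IP address and token differ in every environment, the two `echo` lines can be folded into a parameterised helper. The sketch below uses a hypothetical `write_rke2_config` function. The optional third argument and the `chmod 600` (the file holds a join secret) are additions for illustration, not part of the original steps.

```shell
# Hypothetical helper: write the RKE2 join configuration.
#   $1 = IP of the initial controlplane node
#   $2 = contents of /var/lib/rancher/rke2/server/node-token from that node
#   $3 = config path (defaults to the path used in this guide)
write_rke2_config() {
  local server_ip="$1" token="$2" config="${3:-/etc/rancher/rke2/config.yaml}"
  mkdir -p "$(dirname "$config")"
  printf 'server: https://%s:9345\ntoken: %s\n' "$server_ip" "$token" > "$config"
  # The token is a cluster secret, so restrict the file to its owner.
  chmod 600 "$config"
}
```

A call such as `write_rke2_config 192.168.3.21 "<node-token>"` would then produce the same two-line file as the commands above.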
Add Worker Node(s)
Worker nodes are the bread-and-butter of a Kubernetes cluster. They run container workloads and act as storage for the cluster (this can be configured to varying degrees based on your needs).
Download the Worker Deployment Script
Configure and Connect to RKE2 Cluster
# Manually Create a Rancher-Kubernetes-Specific Config File
mkdir -p /etc/rancher/rke2/
# Inject IP of Initial ControlPlane Node into Config File
echo "server: https://192.168.3.21:9345" > /etc/rancher/rke2/config.yaml
# Inject the Initial ControlPlane Node trust token into the config file
# You can get the token by running the following command on the first node in the cluster: `cat /var/lib/rancher/rke2/server/node-token`
echo "token: K10aa0632863da4ae4e2ccede0ca6a179f510a0eee0d6d6eb53dca96050048f055e::server:3b130ceebfbb7ed851cd990fe55e6f3a" >> /etc/rancher/rke2/config.yaml
# Enable and Start the RKE2 Agent Service
systemctl enable rke2-agent.service
systemctl start rke2-agent.service
DNS Server Record
You will need to set up some kind of DNS record to point the FQDN of the cluster (e.g. rancher.bunny-lab.io) to the IP address of the initial controlplane. This can be achieved in a number of ways, such as editing the Windows HOSTS file, Linux's /etc/hosts file, a Windows DNS Server "A" record, or an NGINX/Traefik reverse proxy.
Once you have added the DNS record, you should be able to access the login page for the Rancher RKE2 Kubernetes cluster. Use the bootstrapPassword mentioned previously to log in, then change it immediately from the user management area of Rancher.
TYPE OF ACCESS | FQDN | IP ADDRESS |
---|---|---|
HOST FILE | rancher.bunny-lab.io | 192.168.3.10 |
REVERSE PROXY | http://rancher.bunny-lab.io:80 | 192.168.5.29 |
DNS RECORD | A Record: rancher.bunny-lab.io | 192.168.3.10 |
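For the HOST FILE row on a Linux client, the entry can be added idempotently with a small helper. This is a sketch: `add_hosts_entry` is a hypothetical function, and the optional third argument exists only so it can be tried against a scratch file before touching `/etc/hosts`.

```shell
# Hypothetical helper: append "<ip> <fqdn>" to a hosts file unless the
# FQDN is already listed as a hostname on some line.
add_hosts_entry() {
  local ip="$1" fqdn="$2" hosts_file="${3:-/etc/hosts}"
  # Fields 2..NF of each hosts line are hostnames; exit 0 if one matches.
  if awk -v h="$fqdn" '{ for (i = 2; i <= NF; i++) if ($i == h) found = 1 } END { exit !found }' "$hosts_file" 2>/dev/null; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$fqdn" >> "$hosts_file"
}
```

With the table's values, `add_hosts_entry 192.168.3.10 rancher.bunny-lab.io` (run as root) adds the record once and does nothing on later runs.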