Deploy RKE2 Cluster

Deploying a Rancher RKE2 cluster is fairly straightforward. Run the commands in order, and pay attention to which steps apply to every machine in the cluster, which apply only to the controlplanes, and which apply only to the workers.

Prerequisites

This document assumes you are running Ubuntu Server 20.04 or later.

All Cluster Nodes

From this point forward, all commands are assumed to be running as root (e.g. after running sudo su).

Run Updates

Run these commands on every server that will participate in the cluster, then let the server finish rebooting BEFORE moving on to the next section.

sudo apt update && sudo apt upgrade -y
sudo apt install nfs-common iptables nano htop -y
echo "Adding 15 Second Delay to Ensure Previous Commands finish running"
sleep 15
sudo apt autoremove -y
sudo reboot
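
On Ubuntu, package upgrades that need a restart leave a marker file at /var/run/reboot-required. The following is a minimal sketch, assuming that default marker path, for checking whether a reboot is still pending (for example after later rounds of updates). The function name reboot_pending is illustrative and not part of any package.

```shell
# Hypothetical helper: succeed (exit 0) if the reboot marker file exists.
#   $1 = marker path (defaults to Ubuntu's /var/run/reboot-required)
reboot_pending() {
  local marker="${1:-/var/run/reboot-required}"
  [ -f "$marker" ]
}

# Example usage:
#   if reboot_pending; then echo "Reboot required"; else echo "No reboot pending"; fi
```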

Tip

If this is a virtual machine, now would be the best time to take a checkpoint / snapshot of the VM before moving forward, in case you need to perform rollbacks of the server(s) if you accidentally misconfigure something.

Initial ControlPlane Node

When you are starting a brand new cluster, you need to create what is referred to as the "Initial ControlPlane". This node is responsible for bootstrapping the entire cluster together in the beginning, and will eventually assist in handling container workloads and orchestrating operations in the cluster.

Warning

Only follow the instructions for the initial controlplane once. Running them on another machine to create additional controlplanes will cause the cluster to try to set up two different clusters, wreaking havoc. Instead, follow the instructions in the next section to add redundant controlplanes.

Download and Run the Server Deployment Script

curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE=server sh -

Enable & Configure Services

# Switch to a root shell
sudo su

# Start and Enable the Kubernetes Service
systemctl enable rke2-server.service
systemctl start rke2-server.service

# Symlink the Kubectl Management Command  
ln -s $(find /var/lib/rancher/rke2/data/ -name kubectl) /usr/local/bin/kubectl

# Temporarily Export the Kubeconfig to manage the cluster from CLI  
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml

# Add a Delay to Allow Cluster to Finish Initializing / Get Ready
echo "Adding 60 Second Delay to Ensure Cluster is Ready - Run (kubectl get node) if the server is still not ready to know when to proceed."
sleep 60

# Check that the Cluster Node is Running and Ready
kubectl get node

Example

This may be a good point to step away for 5 minutes and get a cup of coffee, so the cluster has a little extra time to be fully ready before moving on.

When the cluster is ready, you should see something like this when you run kubectl get node:

root@awx:/home/nicole# kubectl get node
NAME   STATUS   ROLES                       AGE     VERSION
awx    Ready    control-plane,etcd,master   3m21s   v1.26.12+rke2r1
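
Instead of relying on a fixed 60-second sleep, readiness can be polled. The following is a minimal sketch, assuming a hypothetical helper named retry_until (it is not part of RKE2 or kubectl) that reruns a command until it succeeds or the attempts run out. The kubectl wait subcommand shown in the usage comment is standard kubectl.

```shell
# Hypothetical helper: run a command repeatedly until it succeeds.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining arguments = the command to run
retry_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while :; do
    "$@" && return 0                      # command succeeded, stop retrying
    [ "$i" -ge "$attempts" ] && return 1  # out of attempts, report failure
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example usage on the initial controlplane (30 attempts, 10 seconds apart):
#   export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
#   retry_until 30 10 kubectl wait --for=condition=Ready node --all --timeout=5s
```

The helper returns non-zero when every attempt fails, so a script can stop at that point rather than continuing against a cluster that is not ready.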

Install Helm, Cert-Manager (Jetstack), Rancher, and Longhorn

# Install Helm  
curl -#L https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

# Install Necessary Helm Repositories
helm repo add rancher-latest https://releases.rancher.com/server-charts/latest
helm repo add jetstack https://charts.jetstack.io
helm repo add longhorn https://charts.longhorn.io
helm repo update

# Install the Cert-Manager CRDs (keep this version in step with the chart version installed below)
kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.6.1/cert-manager.crds.yaml

# Install Cert-Manager via Helm (from the Jetstack repository)
helm upgrade -i cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace

# Install Rancher via Helm
helm upgrade -i rancher rancher-latest/rancher --create-namespace --namespace cattle-system --set hostname=rancher.bunny-lab.io --set bootstrapPassword=bootStrapAllTheThings --set replicas=1

# Install Longhorn via Helm
helm upgrade -i longhorn longhorn/longhorn --namespace longhorn-system --create-namespace

Be Patient - Come back in 20 Minutes

Rancher takes a while to fully set itself up, and things will appear broken in the meantime. How long it takes depends on how many resources you gave the cluster. A good ballpark is to give it at least 20 minutes to deploy itself before attempting to log into the webUI at https://rancher.bunny-lab.io (the hostname set in the Helm command above).

If you want to keep an eye on the deployment progress, run the following command:

KUBECONFIG=/etc/rancher/rke2/rke2.yaml kubectl get pods --all-namespaces

The output should look similar to this:

NAMESPACE                         NAME                                                    READY   STATUS      RESTARTS        AGE
cattle-fleet-system               fleet-controller-59cdb866d7-94r2q                       1/1     Running     0               4m31s
cattle-fleet-system               gitjob-f497866f8-t726l                                  1/1     Running     0               4m31s
cattle-provisioning-capi-system   capi-controller-manager-6f87d6bd74-xx22v                1/1     Running     0               55s
cattle-system                     helm-operation-28dcp                                    0/2     Completed   0               109s
cattle-system                     helm-operation-f9qww                                    0/2     Completed   0               4m39s
cattle-system                     helm-operation-ft8gq                                    0/2     Completed   0               26s
cattle-system                     helm-operation-m27tq                                    0/2     Completed   0               61s
cattle-system                     helm-operation-qrgj8                                    0/2     Completed   0               5m11s
cattle-system                     rancher-64db9f48c-qm6v4                                 1/1     Running     3 (8m8s ago)    13m
cattle-system                     rancher-webhook-65f5455d9c-tzbv4                        1/1     Running     0               98s
cert-manager                      cert-manager-55cf8685cb-86l4n                           1/1     Running     0               14m
cert-manager                      cert-manager-cainjector-fbd548cb8-9fgv4                 1/1     Running     0               14m
cert-manager                      cert-manager-webhook-655b4d58fb-s2cjh                   1/1     Running     0               14m
kube-system                       cloud-controller-manager-awx                            1/1     Running     5 (3m37s ago)   19m
kube-system                       etcd-awx                                                1/1     Running     0               19m
kube-system                       helm-install-rke2-canal-q9vm6                           0/1     Completed   0               19m
kube-system                       helm-install-rke2-coredns-q8w57                         0/1     Completed   0               19m
kube-system                       helm-install-rke2-ingress-nginx-54vgk                   0/1     Completed   0               19m
kube-system                       helm-install-rke2-metrics-server-87zhw                  0/1     Completed   0               19m
kube-system                       helm-install-rke2-snapshot-controller-crd-q6bh6         0/1     Completed   0               19m
kube-system                       helm-install-rke2-snapshot-controller-tjk5f             0/1     Completed   0               19m
kube-system                       helm-install-rke2-snapshot-validation-webhook-r9pcn     0/1     Completed   0               19m
kube-system                       kube-apiserver-awx                                      1/1     Running     0               19m
kube-system                       kube-controller-manager-awx                             1/1     Running     5 (3m37s ago)   19m
kube-system                       kube-proxy-awx                                          1/1     Running     0               19m
kube-system                       kube-scheduler-awx                                      1/1     Running     5 (3m35s ago)   19m
kube-system                       rke2-canal-gm45f                                        2/2     Running     0               19m
kube-system                       rke2-coredns-rke2-coredns-565dfc7d75-qp64p              1/1     Running     0               19m
kube-system                       rke2-coredns-rke2-coredns-autoscaler-6c48c95bf9-fclz5   1/1     Running     0               19m
kube-system                       rke2-ingress-nginx-controller-lhjwq                     1/1     Running     0               17m
kube-system                       rke2-metrics-server-c9c78bd66-fnvx8                     1/1     Running     0               18m
kube-system                       rke2-snapshot-controller-6f7bbb497d-dw6v4               1/1     Running     4 (6m17s ago)   18m
kube-system                       rke2-snapshot-validation-webhook-65b5675d5c-tdfcf       1/1     Running     0               18m
longhorn-system                   csi-attacher-785fd6545b-6jfss                           1/1     Running     1 (6m17s ago)   9m39s
longhorn-system                   csi-attacher-785fd6545b-k7jdh                           1/1     Running     0               9m39s
longhorn-system                   csi-attacher-785fd6545b-rr6k4                           1/1     Running     0               9m39s
longhorn-system                   csi-provisioner-8658f9bd9c-58dc8                        1/1     Running     0               9m38s
longhorn-system                   csi-provisioner-8658f9bd9c-g8cv2                        1/1     Running     0               9m38s
longhorn-system                   csi-provisioner-8658f9bd9c-mbwh2                        1/1     Running     0               9m38s
longhorn-system                   csi-resizer-68c4c75bf5-d5vdd                            1/1     Running     0               9m36s
longhorn-system                   csi-resizer-68c4c75bf5-r96lf                            1/1     Running     0               9m36s
longhorn-system                   csi-resizer-68c4c75bf5-tnggs                            1/1     Running     0               9m36s
longhorn-system                   csi-snapshotter-7c466dd68f-5szxn                        1/1     Running     0               9m30s
longhorn-system                   csi-snapshotter-7c466dd68f-w96lw                        1/1     Running     0               9m30s
longhorn-system                   csi-snapshotter-7c466dd68f-xt42z                        1/1     Running     0               9m30s
longhorn-system                   engine-image-ei-68f17757-jn986                          1/1     Running     0               10m
longhorn-system                   instance-manager-fab02be089480f35c7b2288110eb9441       1/1     Running     0               10m
longhorn-system                   longhorn-csi-plugin-5j77p                               3/3     Running     0               9m30s
longhorn-system                   longhorn-driver-deployer-75fff9c757-dps2j               1/1     Running     0               13m
longhorn-system                   longhorn-manager-2vfr4                                  1/1     Running     4 (10m ago)     13m
longhorn-system                   longhorn-ui-7dc586665c-hzt6k                            1/1     Running     0               13m
longhorn-system                   longhorn-ui-7dc586665c-lssfj                            1/1     Running     0               13m
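
In a listing this long, pods that are still starting are easy to miss. The following is a minimal sketch, assuming the default column layout shown above (NAMESPACE, NAME, READY, STATUS, and so on), that prints only the pods still needing attention. The function name pods_not_ready is illustrative.

```shell
# Hypothetical helper: read `kubectl get pods --all-namespaces` output on stdin
# and print "namespace/name status" for every pod that is neither Completed
# nor Running with all of its containers ready.
pods_not_ready() {
  awk 'NR > 1 {
    split($3, ready, "/")                              # READY column, e.g. "1/1"
    if ($4 == "Completed") next                        # finished jobs are fine
    if ($4 == "Running" && ready[1] == ready[2]) next  # fully ready pods are fine
    print $1 "/" $2, $4
  }'
}

# Example usage:
#   KUBECONFIG=/etc/rancher/rke2/rke2.yaml kubectl get pods --all-namespaces | pods_not_ready
```

An empty result means every pod is either Completed or fully ready.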

Note

Be sure to write down the "bootstrapPassword" value for when you log into Rancher later. In this example, the password is bootStrapAllTheThings. Also adjust the "hostname" value to reflect the FQDN of the cluster; the example uses rancher.bunny-lab.io. You can leave it at the default and change it upon first login if you want, but it matters for the last step, where you set up DNS.
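
Rather than reusing the example password, a random one can be generated before running the Rancher install. The following is a minimal sketch; the function name gen_bootstrap_password and the variable BOOTSTRAP_PASSWORD are illustrative, and the 16-byte length is an arbitrary choice.

```shell
# Hypothetical helper: print a random 32-character hexadecimal password
# built from 16 bytes of /dev/urandom.
gen_bootstrap_password() {
  head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Example usage:
#   BOOTSTRAP_PASSWORD="$(gen_bootstrap_password)"
#   echo "Rancher bootstrap password: $BOOTSTRAP_PASSWORD"   # write this down
#   Then pass --set bootstrapPassword="$BOOTSTRAP_PASSWORD" in the Rancher helm command.
```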

Log into webUI

At this point, you can log into the webUI at https://rancher.bunny-lab.io using the default bootStrapAllTheThings password, or whatever password you configured. You can change the password after logging in by navigating to Home > Users & Authentication > "..." > Edit Config > "New Password" > Save. From here, you can deploy more nodes, or deploy single-node workloads such as an Ansible AWX Operator.

Rebooting the ControlPlane Node

If you ever need to reboot the controlplane node and then run kubectl CLI commands, you will need to run the command below again to import the cluster credentials, because the exported variable does not survive a reboot (or a new shell session). Reboots should take much less time to get the cluster ready again than the original deployment did.

export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
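
To avoid retyping the export after each reboot, the line can be added to a shell startup file. The following is a minimal sketch, assuming root uses bash and that ~/.bashrc is the file to modify. The function name persist_kubeconfig is illustrative.

```shell
# Hypothetical helper: append the KUBECONFIG export to a shell startup file
# exactly once, no matter how many times it is called.
#   $1 = startup file (defaults to ~/.bashrc)
persist_kubeconfig() {
  local rc_file="${1:-$HOME/.bashrc}"
  local line='export KUBECONFIG=/etc/rancher/rke2/rke2.yaml'
  touch "$rc_file"
  # -x matches whole lines only, -F treats the pattern as a fixed string
  grep -qxF "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
}

# Example usage (as root on the controlplane node):
#   persist_kubeconfig
```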

Create Additional ControlPlane Node(s)

This is where you add more controlplane nodes to give the RKE2 cluster redundancy, which is important for high-availability environments.

Download the Server Deployment Script

curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE=server sh -

Configure and Connect to Initial ControlPlane Node

# Manually Create a Rancher-Kubernetes-Specific Config File  
mkdir -p /etc/rancher/rke2/

# Inject IP of Initial ControlPlane Node into Config File
echo "server: https://192.168.3.21:9345" > /etc/rancher/rke2/config.yaml

# Inject the Initial ControlPlane Node trust token into the config file
# You can get the token by running the following command on the first node in the cluster: `cat /var/lib/rancher/rke2/server/node-token`
echo "token: K10aa0632863da4ae4e2ccede0ca6a179f510a0eee0d6d6eb53dca96050048f055e::server:3b130ceebfbb7ed851cd990fe55e6f3a" >> /etc/rancher/rke2/config.yaml

# Start and Enable the Kubernetes Service
systemctl enable rke2-server.service
systemctl start rke2-server.service

# Symlink the Kubectl Management Command (done after the service starts, since the data directory is populated on first start)
ln -s $(find /var/lib/rancher/rke2/data/ -name kubectl) /usr/local/bin/kubectl

Note

Be sure to change the IP address of the initial controlplane node provided in the example above to match your environment.
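
Because the IP address and token differ per environment, the two echo commands can be wrapped in a small function. The following is a minimal sketch; the function name write_rke2_join_config and its parameters are illustrative, and restricting the file to root is an added precaution that the steps above do not include.

```shell
# Hypothetical helper: write an RKE2 join config file.
#   $1 = IP or hostname of the initial controlplane node
#   $2 = token from /var/lib/rancher/rke2/server/node-token on that node
#   $3 = config directory (defaults to /etc/rancher/rke2)
write_rke2_join_config() {
  local server_ip="$1" token="$2" config_dir="${3:-/etc/rancher/rke2}"
  mkdir -p "$config_dir"
  cat > "$config_dir/config.yaml" <<EOF
server: https://${server_ip}:9345
token: ${token}
EOF
  # The token grants cluster membership, so keep the file readable by root only.
  chmod 600 "$config_dir/config.yaml"
}

# Example usage (placeholders, not real values):
#   write_rke2_join_config 192.168.3.21 "<contents of node-token>"
```

The same file layout is used by worker nodes, so the function could be reused there before starting rke2-agent.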

Add Worker Node(s)

Worker nodes are the bread-and-butter of a Kubernetes cluster. They run container workloads and act as storage for the cluster (this can be configured to varying degrees based on your needs).

Download the Worker (Agent) Deployment Script

curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE=agent sh -

Configure and Connect to RKE2 Cluster

# Manually Create a Rancher-Kubernetes-Specific Config File  
mkdir -p /etc/rancher/rke2/

# Inject IP of Initial ControlPlane Node into Config File  
echo "server: https://192.168.3.21:9345" > /etc/rancher/rke2/config.yaml

# Inject the Initial ControlPlane Node trust token into the config file  
# You can get the token by running the following command on the first node in the cluster: `cat /var/lib/rancher/rke2/server/node-token`
echo "token: K10aa0632863da4ae4e2ccede0ca6a179f510a0eee0d6d6eb53dca96050048f055e::server:3b130ceebfbb7ed851cd990fe55e6f3a" >> /etc/rancher/rke2/config.yaml

# Start and Enable the Kubernetes Service
systemctl enable rke2-agent.service
systemctl start rke2-agent.service
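
If the agent does not join, a common cause is that the worker cannot reach port 9345 on the controlplane, which is the port used in the server URL above. The following is a minimal sketch of a reachability check, assuming bash and the coreutils timeout command are available. The function name port_open is illustrative.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port opens within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device and the coreutils `timeout` command.
#   $1 = host or IP, $2 = TCP port
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example usage from a worker, before starting rke2-agent:
#   port_open 192.168.3.21 9345 && echo "registration port reachable"
```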

DNS Server Record

You will need to set up some kind of DNS record that points the FQDN of the cluster (e.g. rancher.bunny-lab.io) to the IP address of the Initial ControlPlane. This can be achieved in a number of ways, such as editing the Windows HOSTS file, Linux's /etc/hosts file, a Windows DNS Server "A" Record, or an NGINX/Traefik Reverse Proxy.

Once you have added the DNS record, you should be able to access the login page for the Rancher RKE2 Kubernetes cluster. Use the bootstrapPassword mentioned previously to log in, then change it immediately from the user management area of Rancher.

TYPE OF ACCESS   FQDN                             IP ADDRESS
HOST FILE        rancher.bunny-lab.io             192.168.3.10
REVERSE PROXY    http://rancher.bunny-lab.io:80   192.168.5.29
DNS RECORD       A Record: rancher.bunny-lab.io   192.168.3.10
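
For the host-file approach on a Linux client, the entry can be added without creating duplicates. The following is a minimal sketch, assuming the standard /etc/hosts format of an IP address followed by one or more names. The function name add_hosts_entry is illustrative.

```shell
# Hypothetical helper: add "IP FQDN" to a hosts file unless the FQDN is already listed.
#   $1 = IP address, $2 = FQDN, $3 = hosts file (defaults to /etc/hosts)
add_hosts_entry() {
  local ip="$1" fqdn="$2" hosts_file="${3:-/etc/hosts}"
  # Look for the FQDN as a whole field (after the IP) on any non-comment line.
  if awk -v name="$fqdn" '
       !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
       END { exit !found }' "$hosts_file" 2>/dev/null; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$fqdn" >> "$hosts_file"
}

# Example usage (as root), matching the HOST FILE row above:
#   add_hosts_entry 192.168.3.10 rancher.bunny-lab.io
```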