
k3s slave agents don’t join Kubernetes cluster (in Vagrant/VirtualBox setup)

I’m trying to set up a k3s cluster in VirtualBox VMs using Vagrant for learning purposes, but my slave agents don’t connect to my master node.

I’m following the instructions from the quick start guide: https://rancher.com/docs/k3s/latest/en/quick-start

Master node setup (script = provision/master-node.sh) … I copy the token into the Vagrant folder to make it available to all other Vagrant boxes; they share the same Vagrantfile. The token file is present on all nodes when the script is invoked (I checked):

k3sTokenFile="/var/lib/rancher/k3s/server/node-token"

echo "[INFO] Install k3s on master-node"
curl -sfL https://get.k3s.io | K3S_KUBECONFIG_MODE="644" K3S_NODE_NAME="$HOSTNAME" sh -
echo "[INFO] Get K3S_TOKEN from master-node"
cp "$k3sTokenFile" /vagrant/resources/.generated/k3s.token
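In case the installer returns before the token file is fully written, the copy could be guarded with a small wait loop. This is only a sketch with /tmp stand-in paths so it runs outside the VM; the real paths are the ones in the script above:

```shell
#!/usr/bin/env bash
# Sketch: wait for the node-token file before copying it, in case the k3s
# server has not finished writing it when the script reaches the cp.
# /tmp stand-ins replace /var/lib/rancher/k3s/server/node-token and the
# /vagrant/resources/.generated/ target so this runs outside the VM.
tokenFile="/tmp/demo-node-token"
rm -f "$tokenFile"
( sleep 1; echo "K10demo::server:token" > "$tokenFile" ) &  # simulate k3s writing the token

for _ in $(seq 1 30); do
    [ -s "$tokenFile" ] && break
    sleep 1
done

cp "$tokenFile" /tmp/k3s.token
cat /tmp/k3s.token
```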

The master node setup works on its own: I can deploy applications and access them via browser (I deployed the dashboard application).
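One thing I’m not sure about: each VirtualBox box has a NAT interface (eth0) plus the private-network interface (eth1), and I don’t know which one k3s binds to. If that matters, the install flags could pin it. This sketch only assembles and prints the flags instead of running the installer; --node-ip and --flannel-iface are documented k3s options, while eth1 being the private-network device is my assumption:

```shell
#!/usr/bin/env bash
# Sketch: pin k3s to the private-network interface instead of letting it
# pick the VirtualBox NAT interface. Assumes eth1 carries 192.168.30.x.
# Only assembles and prints the flags; it does not run the installer.
masterIP="192.168.30.10"
installExec="--node-ip $masterIP --flannel-iface eth1"
echo "INSTALL_K3S_EXEC=\"$installExec\""

# The real install line would become:
# curl -sfL https://get.k3s.io | K3S_KUBECONFIG_MODE="644" \
#     K3S_NODE_NAME="$HOSTNAME" INSTALL_K3S_EXEC="$installExec" sh -
```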

Slave node setup (script = provision/worker-node.sh) … The IP of the master node is correct (see the Vagrantfile at the end of the post):

echo "[INFO] Read K3S_TOKEN from filesystem"
tmp=$(</vagrant/resources/.generated/k3s.token)
k3sToken="${tmp//[[:space:]]/}" # strip whitespace (the pattern ${tmp%%*( )} only trims with shopt -s extglob enabled)

masterIP="192.168.30.10"

echo "[INFO] Install k3s on worker-node and join cluster"
curl -sfL https://get.k3s.io | K3S_URL="https://$masterIP:6443" K3S_TOKEN="$k3sToken" K3S_NODE_NAME="$HOSTNAME" sh -
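To rule out a garbled or empty token, the worker script could also validate it before piping the installer. Again a stand-alone sketch; a simulated token file stands in for the real /vagrant path:

```shell
#!/usr/bin/env bash
# Sketch: fail fast if the shared token is empty before attempting the join.
# A temp file with a made-up token simulates the shared /vagrant file.
tokenFile="$(mktemp)"
echo "K10demo::server:token" > "$tokenFile"

k3sToken="$(tr -d '[:space:]' < "$tokenFile")"   # robust trim, no extglob needed
if [ -z "$k3sToken" ]; then
    echo "[ERROR] empty k3s token, aborting join" >&2
    exit 1
fi
echo "[INFO] token looks sane: $k3sToken"
```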

This is the log output of the slave agent setup:

v-k3s-worker-3: [INFO] Install k3s on worker-node and join cluster
v-k3s-worker-3: [INFO]  Finding release for channel stable
v-k3s-worker-3: [INFO]  Using v1.21.3+k3s1 as release
v-k3s-worker-3: [INFO]  Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.21.3+k3s1/sha256sum-amd64.txt
v-k3s-worker-3: [INFO]  Downloading binary https://github.com/k3s-io/k3s/releases/download/v1.21.3+k3s1/k3s
v-k3s-worker-3: [INFO]  Verifying binary download
v-k3s-worker-3: [INFO]  Installing k3s to /usr/local/bin/k3s
v-k3s-worker-3: [INFO]  Creating /usr/local/bin/kubectl symlink to k3s
v-k3s-worker-3: [INFO]  Creating /usr/local/bin/crictl symlink to k3s
v-k3s-worker-3: [INFO]  Creating /usr/local/bin/ctr symlink to k3s
v-k3s-worker-3: [INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
v-k3s-worker-3: [INFO]  Creating uninstall script /usr/local/bin/k3s-agent-uninstall.sh
v-k3s-worker-3: [INFO]  env: Creating environment file /etc/systemd/system/k3s-agent.service.env
v-k3s-worker-3: [INFO]  systemd: Creating service file /etc/systemd/system/k3s-agent.service
v-k3s-worker-3: [INFO]  systemd: Enabling k3s-agent unit
v-k3s-worker-3: Created symlink /etc/systemd/system/multi-user.target.wants/k3s-agent.service → /etc/systemd/system/k3s-agent.service.
v-k3s-worker-3: [INFO]  systemd: Starting k3s-agent

The node status after all Vagrantboxes are up and running and all setup scripts finished:

$ vagrant ssh v-k3s-master
$ sudo kubectl --kubeconfig /etc/rancher/k3s/k3s.yaml get nodes
NAME           STATUS   ROLES                  AGE   VERSION
v-k3s-master   Ready    control-plane,master   59m   v1.21.3+k3s1

When I access https://localhost:6443 from my host machine via browser, I get a warning because the certificate is not trusted (Error code: SEC_ERROR_UNKNOWN_ISSUER). The response must come from the master node, since no slave agent is connected. As far as I understand, this port 6443 endpoint is also the address slave agents use to join the cluster.

Since I can telnet from a slave VM to the master VM, basic network connectivity seems to work, but curl reports a certificate error. Also, the master VM is reachable by IP only, not by its hostname:

vagrant@v-k3s-worker-3:~$ telnet 192.168.30.10 6443
Trying 192.168.30.10...
Connected to 192.168.30.10.
Escape character is '^]'.
^CConnection closed by foreign host.

vagrant@v-k3s-worker-3:~$ telnet v-k3s-master 6443
telnet: could not resolve v-k3s-master/6443: Temporary failure in name resolution

vagrant@v-k3s-worker-3:~$ curl https://192.168.30.10:6443
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
vagrant@v-k3s-worker-3:~$ 
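As far as I can tell, the curl error on its own is expected: the k3s API server presents a certificate signed by k3s’s internal CA, which curl doesn’t trust by default. This local demo (a throwaway CA and server cert in a temp dir, no cluster involved) reproduces the same "unable to get local issuer certificate" situation:

```shell
#!/usr/bin/env bash
# Local reproduction of the curl error: a cert signed by a private CA fails
# verification until that CA is supplied, just like the k3s API cert.
dir="$(mktemp -d)"

# Throwaway "cluster CA" plus a server certificate signed by it.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj "/CN=demo-k3s-ca" \
    -keyout "$dir/ca.key" -out "$dir/ca.crt" 2>/dev/null
openssl req -newkey rsa:2048 -nodes -subj "/CN=v-k3s-master" \
    -keyout "$dir/server.key" -out "$dir/server.csr" 2>/dev/null
openssl x509 -req -days 1 -in "$dir/server.csr" -CA "$dir/ca.crt" \
    -CAkey "$dir/ca.key" -CAcreateserial -out "$dir/server.crt" 2>/dev/null

# Without the CA: fails like curl's "unable to get local issuer certificate".
openssl verify "$dir/server.crt" || true
# With the CA supplied (the curl --cacert equivalent): verification succeeds.
openssl verify -CAfile "$dir/ca.crt" "$dir/server.crt"
```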

My full Vagrantfile:

# -*- mode: ruby -*-
# vi: set ft=ruby :

IMAGE_NAME = "ubuntu/focal64" # focal = 20.04 LTS | bento/ubuntu-16.04
N = 3 # number of worker nodes

Vagrant.configure("2") do |config|
    config.ssh.insert_key = false

    # common for all vagrant boxes
    config.vm.provider "virtualbox" do |v|
        v.memory = 2048
        v.cpus = 2
        v.customize ["modifyvm", :id, "--groups", "/v-kube-cluster"]
    end

    # master node
    config.vm.define "v-k3s-master" do |master|
        master.vm.box = IMAGE_NAME
        master.vm.network "private_network", ip: "192.168.30.10"
        master.vm.hostname = "v-k3s-master"

        master.vm.provider "virtualbox" do |v|
            v.name = "v-k3s-master"
        end

        master.vm.network "forwarded_port", guest: 6443, host: 6443
        master.vm.network "forwarded_port", guest: 8001, host: 8001

        master.vm.provision "shell", path: "../common/provision/bash-setup.sh"
        master.vm.provision "shell", path: "../common/provision/install.sh"
        master.vm.provision "shell", path: "provision/master-node.sh"
    end

    # worker nodes
    (1..N).each do |i|
        config.vm.define "v-k3s-worker-#{i}" do |worker|
            worker.vm.box = IMAGE_NAME
            worker.vm.network "private_network", ip: "192.168.30.#{i + 10}"
            worker.vm.hostname = "v-k3s-worker-#{i}"

            worker.vm.provider "virtualbox" do |v|
                v.name = "v-k3s-worker-#{i}"
            end

            worker.vm.provision "shell", path: "../common/provision/bash-setup.sh"
            worker.vm.provision "shell", path: "../common/provision/install.sh"
            worker.vm.provision "shell", path: "provision/worker-node.sh"
        end
    end
end

I don’t know how to solve the issue. I suspect the SSL settings are the problem, but I don’t know how to tackle it.

Or maybe the problem is that the VMs can reach each other only by IP and not by machine name (= hostname), i.e. a DNS issue?
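If hostname resolution turns out to matter, a minimal workaround would be to push static entries into /etc/hosts on every box during provisioning. The IPs and names below are taken from my Vagrantfile; a temp file stands in for /etc/hosts so the sketch runs stand-alone:

```shell
#!/usr/bin/env bash
# Sketch: static name resolution for all cluster nodes via /etc/hosts.
# A temp file stands in for /etc/hosts so this is runnable anywhere.
hostsFile="$(mktemp)"   # the real provisioning step would append to /etc/hosts

cat >> "$hostsFile" <<'EOF'
192.168.30.10 v-k3s-master
192.168.30.11 v-k3s-worker-1
192.168.30.12 v-k3s-worker-2
192.168.30.13 v-k3s-worker-3
EOF

grep v-k3s-master "$hostsFile"
```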

All scripts etc. are in my project repo: https://gitlab.com/sommerfeld.sebastian/v-kube-cluster/-/tree/feat/k3s/src/main/k3s (the URL points to the relevant directory).