k3s
k3s is a lightweight distribution of Kubernetes designed for unattended workloads in resource-constrained environments.
Installation
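A single server node is enough to run k3s, though the setup described here uses at least 2 nodes (one server and one agent). By default k3s stores cluster state in SQLite rather than etcd, so a 3-node quorum isn't required. An external datastore such as PostgreSQL can be used instead.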
K3OS
Use the K3OS image to quickly build a k3s cluster. Download the ISO and boot it on at least 2 nodes. Log in as rancher and start the installer with sudo os-config. The first node is built as a server (controller), with subsequent nodes as agents (workers). When building a server, specify a token. Otherwise, you will need to find the randomly generated token after installation at /var/lib/rancher/k3s/server/node-token:
node1 [/home/rancher]$ cat /var/lib/rancher/k3s/server/node-token
K1061d5fbfbc0952e856faaf32da12cc8718b8991dc2e03a96e0a07348455f305b5::node:4e5800b39e59863b7126eb1c88bb8957
The last value (4e5800b39e59863b7126eb1c88bb8957) is the token which you need to use when setting up agent nodes.
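Pulling out that last value can be scripted. This is a minimal sketch, assuming the node-token line keeps the "<prefix>::node:<secret>" layout shown above; the function name agent_token is made up for illustration.

```shell
# Print the last colon-separated field of a k3s node-token line.
# Assumes the "<prefix>::node:<secret>" layout shown above.
agent_token() {
    printf '%s\n' "$1" | awk -F: '{ print $NF }'
}

# On the server node (path taken from the text above):
#   agent_token "$(cat /var/lib/rancher/k3s/server/node-token)"
```

Splitting on ":" and taking the final field avoids hard-coding the length of the prefix or of the secret.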
After the server node is running, you can obtain a kubeconfig file from /etc/rancher/k3s/k3s.yaml. Copy it to ~/.kube/config and ensure the server address in it resolves properly.
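If the server address in the file points at a loopback address, it has to be changed before the file is usable from another machine. The following is a rough sketch, assuming the file contains a line like "server: https://127.0.0.1:6443" (an assumption; check your own copy). The function name kubeconfig_for_host and the hostname node1 are placeholders.

```shell
# Print a copy of a kubeconfig with the API server host replaced.
# Assumes a line such as "server: https://127.0.0.1:6443" exists in
# the file; the port and everything else are left untouched.
kubeconfig_for_host() {
    # $1 = path to the kubeconfig, $2 = hostname of the server node
    sed -E "s#(server: https://)[^:/]+#\1$2#" "$1"
}

# Example, with "node1" standing in for your server's hostname:
#   mkdir -p ~/.kube
#   kubeconfig_for_host k3s.yaml node1 > ~/.kube/config
#   kubectl get nodes
```

Writing to standard output rather than editing in place keeps the original file intact, so the result can be inspected before it replaces ~/.kube/config.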
Stuff
Label nodes:
$ kubectl label node node1 kubernetes.io/role=master
$ kubectl label node node1 node-role.kubernetes.io/master=""
$ kubectl label node nodeN kubernetes.io/role=node
$ kubectl label node nodeN node-role.kubernetes.io/node=""
Troubleshooting
I kept getting Get https://acme-v02.api.letsencrypt.org/directory: x509: certificate is valid for 1c00f28ec31e740b40f7d89acf5ff01c.931313b93244347524059db81dadaf52.traefik.default, not acme-v02.api.letsencrypt.org when trying to get ACME certs:
{"level":"error","msg":"Unable to obtain ACME certificate for domains \"load.home.steamr.com\" detected thanks to rule \"Host:load.home.steamr.com\" : cannot get ACME client get directory at 'https://acme-v02.api.letsencrypt.org/directory': Get https://acme-v02.api.letsencrypt.org/directory: x509: certificate is valid for 1c00f28ec31e740b40f7d89acf5ff01c.931313b93244347524059db81dadaf52.traefik.default, not acme-v02.api.letsencrypt.org","time":"2019-11-10T03:59:16Z"}
{"level":"info","msg":"Updated status on ingress default/load-ingress","time":"2019-11-10T03:59:17Z"}
{"level":"info","msg":"Updated status on ingress default/load-ingress","time":"2019-11-10T03:59:17Z"}
{"level":"warning","msg":"Error checking new version: Get https://update.traefik.io/repos/containous/traefik/releases: x509: certificate is valid for 1c00f28ec31e740b40f7d89acf5ff01c.931313b93244347524059db81dadaf52.traefik.default, not update.traefik.io","time":"2019-11-10T04:08:58Z"}
This occurred with both the ACME client in traefik and cert-manager. It turned out that the DNS setup on my network was broken, which led to this problem. I had a search domain set, which caused lookups to be resolved as subdomains of home.steamr.com.
Verify that a lookup of acme-v02.api.letsencrypt.org resolves to the correct address. If it doesn't, or if it only works with a trailing . at the end, your DNS is broken.
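One way to run this check is sketched below. The helper search_domains is a made-up name; it only lists the search domains from a resolv.conf-style file. The getent commands in the comments assume a system that ships getent; nslookup or dig would work just as well.

```shell
# Print the search domains configured in a resolv.conf-style file
# (defaults to /etc/resolv.conf), one per line.
search_domains() {
    awk '$1 == "search" { for (i = 2; i <= NF; i++) print $i }' "${1:-/etc/resolv.conf}"
}

# Compare a normal lookup with a fully qualified one (trailing dot).
# If only the second returns the expected public address, a search
# domain is probably interfering:
#   search_domains
#   getent hosts acme-v02.api.letsencrypt.org
#   getent hosts acme-v02.api.letsencrypt.org.
```

Run the same comparison inside a pod as well as on the node, since the resolver configuration can differ between the two.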