Commit 011e5408 authored by Dominik Süß

draft for the k3s series

parent deede91c
```diff
@@ -8,7 +8,7 @@ defaultcontentlanguage = "en" # The default content language.
 paginate = 20 # Default number of pages per page in pagination.
 canonifyurls = true # Enable to turn relative URLs into absolute.
-pygmentsstyle = "bw" # Color-theme or style for syntax highlighting.
+pygmentsstyle = "fruity" # Color-theme or style for syntax highlighting.
 pygmentscodefences = true # Enable code fence background highlighting.
 pygmentscodefencesguesssyntax = true # Enable syntax guessing for code fences without specified language.
```

---
title: "Scaleable Chaos - a homelab kubernetes setup"
date: 2019-08-31T21:37:30+02:00
draft: true
categories:
- devops
- kubernetes
- wireguard
- vpn
---
Recently I decided to spend some money on a small VPS to call my own and mess
around with. As I've been dealing a lot with OpenShift at work lately, the
first idea that came to mind was to set up a small Kubernetes cluster. The only
problem: a single node can hardly be considered a cluster. So I thought for a
bit and decided to build a cluster spanning three hosts across three different
locations: the VPS provider, GCP, and my home.

Previously, I'd been managing my home server setup with a set of Ansible
playbooks that set up systemd units spawning Docker containers. This setup
worked pretty well, but I always wished for something better. My main issues
were the rather verbose playbook files, a lack of proper resource limits (not a
technical limitation but laziness on my side) and a nasty issue with my home
media streaming service of choice -
[Jellyfin](https://github.com/jellyfin/jellyfin).

My TV at home runs LG WebOS, and since the Emby folks added some checks that
prevent the client from connecting to Jellyfin, I needed a workaround to keep
streaming to my TV. The choice was either running some `sed` commands every
time I restarted the Emby container or building a custom image that replaces
the values at build time. From the moment I first had to do this, my immediate
reaction was: _init containers would be nice right now..._

So this blog series will document my journey to full DevOps automation at home
and everything that comes with it. As my setup is ever evolving, don't take
this as a guide or reference, but maybe get inspired to set up something
similar at home. You can find my services, playbooks and other config files at
[https://git.sr.ht/~thesuess/k3s/tree/master/cluster](https://git.sr.ht/~thesuess/k3s/tree/master/cluster).

# Infrastructure

First of all, let's talk infrastructure. As mentioned before, the cluster will
span three nodes across three different locations. Here is a short overview of
each one:

* `rauchhaus.suess.wtf`: VPS hosted @ [fastpipe.io](https://fastpipe.io)
  * Fedora 30
  * `1` Core
  * `2GB` Memory
  * Kubernetes master + VPN gateway
* `rantanplan.suess.wtf`: Dell home server <!-- TODO: model -->
  * Fedora 29
  * `4` Cores
  * `8GB` Memory
  * Media server, NAS (software RAID providing ~7.3 TB), home automation, MPD server
* `ananas.suess.wtf`: `f1.micro` GCP instance
  * Fedora 30
  * `1` Core
  * `600MB` Memory
  * Really just to see how far I can push this (+ it's free, so why not)

(If you are confused by the server names, I often choose names
[based](https://www.youtube.com/watch?v=wW-9lJGmRfg) on
[songs](https://www.youtube.com/watch?v=9Qq0MqINpbY) and [artists](https://www.youtube.com/watch?v=tK8iTnL_7Ws))

## Setting up the nodes

I'm using Fedora as my base OS since I'm pretty comfortable working with
RPM-based distros and have been using Fedora as my daily driver for quite some
time now. I briefly considered running Fedora CoreOS for the VPS nodes, but
setting up the VPN would have been too much of a bother for this project, so I
stayed within my comfort zone.

Because Fedora is not available in Google Cloud out of the box, I pretty much
followed this
[guide](https://linuxhint.com/install-fedora-google-compute-engine/) without
much deviation.

After the initial setup, the next step is to establish a connection between the
hosts.

## VPN Setup

Normally when setting up a k8s cluster, you want all your nodes in the same
network to allow for easier communication and contiguous IP ranges. Since our
hosts are scattered across the internet, we need to set up a VPN. This also
doubles as a security feature for our internal container network, but more on
that later. The VPN solution I'm most comfortable with is
[WireGuard](https://www.wireguard.com/). The reasoning behind this is its
relatively simple setup, easy-to-understand protocol, and performance. Another
important aspect is the integration into `systemd-networkd` (also one of the
reasons I'm using Fedora instead of CentOS).

To bootstrap the nodes and configure the VPN, I'm using
[Ansible](https://docs.ansible.com/) - again because of its simplicity and ease
of use. You can find the playbooks and roles I'm using
[here](https://git.sr.ht/~thesuess/k3s/tree/master/cluster). The `wireguard`
role performs the following steps:

* configure the
  [jdoss/wireguard](https://copr.fedorainfracloud.org/coprs/jdoss/wireguard/) COPR
* install the wireguard packages
* template the
  [netdev](https://git.sr.ht/~thesuess/k3s/tree/master/cluster/roles/wireguard/templates/wireguard.netdev)
  and [network](https://git.sr.ht/~thesuess/k3s/tree/master/cluster/roles/wireguard/templates/wireguard.network)
  files (a rendered example is shown below)
* configure firewalld to allow incoming connections on port `3333` and allow all
  traffic inside the wireguard network
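
To give a better picture, here is a minimal sketch of what the two rendered
files look like on one of the nodes - keys, addresses and endpoints are
placeholders rather than my actual values:

```ini
# wireguard.netdev - rendered into /etc/systemd/network/
[NetDev]
Name=wg0
Kind=wireguard
Description=cluster vpn

[WireGuard]
PrivateKey=<this node's private key>
ListenPort=3333

[WireGuardPeer]
PublicKey=<peer's public key>
AllowedIPs=<peer's VPN address>/32
Endpoint=<peer's public hostname>:3333
PersistentKeepalive=25

# wireguard.network - assigns the VPN address to the interface
[Match]
Name=wg0

[Network]
Address=<this node's VPN address>/24
```

After a `systemctl restart systemd-networkd`, the `wg0` interface comes up and
the node can talk to every peer listed in its netdev file.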

I've chosen the IP range `10.33.33.0/24` for the cluster network (the reasons
behind the number three will become apparent later). Before running the
playbook, we need to generate a private and public key for each node. The
simplest way to accomplish this is a single line:
```sh
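# writes the private key to ./private and the derived public key to ./public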
wg genkey | tee private | wg pubkey > public
```
After adding the keys to the ansible group vars and running the playbook, each
node should be able to reach the entire `10.33.33.0/24` network. You can verify
this with the `wg` command:
```
$ sudo wg
interface: wg0
  public key: eBtICB58pdIInDiV6pCEatMw3ZUIZIobQBXW3wDTqwI=
  private key: (hidden)
  listening port: 3333

peer: 6GVy6YDJ2trUJKadecggSHk6ylpx89dadceEIb8f62o=
  endpoint: ...:42826
  allowed ips: 10.33.33.1/32
  latest handshake: 34 seconds ago
  transfer: 98.24 KiB received, 158.96 KiB sent
  persistent keepalive: every 25 seconds

peer: 8ohU1PkdTK6ir+gGBPCJLF4ZgxtwS1VKjRKNhdNNThM=
  endpoint: ...:3333
  allowed ips: 10.33.33.2/32
  latest handshake: 48 seconds ago
  transfer: 78.57 KiB received, 129.46 KiB sent
  persistent keepalive: every 25 seconds
```
In my case, one of my nodes (my home server) does not have a publicly
accessible IP (well, it would have an IPv6 address, but since GCP does not
support IPv6, I'm ignoring that for now), so it does not listen on a port but
simply connects to the other nodes as a client.
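
In WireGuard terms this just means that the other nodes have no `Endpoint`
configured for this peer and learn its address once it connects. A sketch of
such a peer entry (key and address are placeholders):

```ini
# peer entry for the home node, as configured on the publicly reachable nodes:
# no Endpoint - it is learned when the peer dials in
[WireGuardPeer]
PublicKey=<home node's public key>
AllowedIPs=<home node's VPN address>/32
```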

# Cluster Setup

Now that our infrastructure is ready, we can finally start to set up the
cluster. Because of our limited resources, I decided against upstream
Kubernetes and am using [k3s](https://k3s.io/) instead. This variant allows for
deployment on low-memory machines and removes some unnecessary bloat. Another
benefit is the simpler setup, which makes the bootstrapping a breeze.

k3s offers a simple install script at `get.k3s.io`, but to make it even easier
to add new nodes to the cluster I also replicated it in
[ansible](https://git.sr.ht/~thesuess/k3s/tree/master/cluster/roles/k3s).
This role only downloads the k3s binary and then builds a systemd unit to start
it. It's important to hardcode the `--node-ip` and `--flannel-iface=wg0` flags,
since k3s tries to be smart and picks the wrong address and interface by
inspecting the default route. For the master node, I also chose to disable the
integrated `traefik` deployment, as I want to set up custom ingresses later on.
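
To illustrate, here is a rough sketch of what such a unit could look like on
the master - the binary path, the VPN address and the exact traefik flag are
assumptions on my side (the flag name has changed between k3s releases), so
check the role for the real thing:

```ini
# /etc/systemd/system/k3s.service (sketch, not the unit the role actually renders)
[Unit]
Description=k3s server
After=network-online.target

[Service]
# pin the VPN address and interface so k3s doesn't guess them from the default route
ExecStart=/usr/local/bin/k3s server \
    --node-ip=10.33.33.1 \
    --flannel-iface=wg0 \
    --no-deploy traefik
Restart=always

[Install]
WantedBy=multi-user.target
```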

## Starting the cluster

Since the worker nodes need a generated token to connect to the master, the
first step is to start only the master. A token file will be generated at
`/var/lib/rancher/k3s/server/node-token`. Paste the contents of this file into
the ansible vault and run the playbook again to start the agents.
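
On the agents, the interesting part of the unit is just the `ExecStart` line -
again with made-up addresses, and the token taken from the vault:

```ini
# agent-side ExecStart (sketch) - the server URL points at the master's VPN address
ExecStart=/usr/local/bin/k3s agent \
    --server https://10.33.33.1:6443 \
    --token <contents of node-token> \
    --node-ip=10.33.33.2 \
    --flannel-iface=wg0
```
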
If all went well, you should be able to successfully list your nodes from the master:
```
$ kubectl get nodes
NAME                   STATUS   ROLES    AGE     VERSION
ananas.suess.wtf       Ready    worker   6d13h   v1.14.6-k3s.1
rantanplan.suess.wtf   Ready    worker   7d16h   v1.14.6-k3s.1
rauchhaus.suess.wtf    Ready    master   7d16h   v1.14.6-k3s.1
```
To use kubectl on your local host, copy the file `/etc/rancher/k3s/k3s.yaml` to
`~/.kube/config` (or merge it with your existing kubeconfig). If you frequently
need to connect to different clusters, I recommend [kubectx +
kubens](https://github.com/ahmetb/kubectx).
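
One thing to keep in mind: the kubeconfig generated by k3s points at
`127.0.0.1`, so you have to swap in the master's address before using it from
another machine. A minimal sketch, assuming SSH access to the master and that
the API port is reachable from your workstation (e.g. over the VPN):

```sh
# fetch the kubeconfig and point it at the master instead of localhost
scp root@rauchhaus.suess.wtf:/etc/rancher/k3s/k3s.yaml ~/.kube/config
sed -i 's/127.0.0.1/rauchhaus.suess.wtf/' ~/.kube/config
```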

## Testing the cluster

To test the cluster, we can create the following DaemonSet, which will spawn a
small pod on each node:
```yaml
---
kind: 'DaemonSet'
apiVersion: 'apps/v1'
metadata:
  namespace: 'default'
  name: 'whoami'
  labels:
    app: 'whoami'
spec:
  selector:
    matchLabels:
      app: 'whoami'
  template:
    metadata:
      labels:
        app: 'whoami'
    spec:
      containers:
        - name: 'whoami'
          image: 'containous/whoami'
          ports:
            - name: 'web'
              containerPort: 80
```
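
Save the manifest to a file (the name below is just an example) and apply it:

```sh
kubectl apply -f whoami-daemonset.yaml
```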
And after a few seconds you can see that all pods have been created:
```
$ kubectl get pods --selector=app=whoami
NAME           READY   STATUS    RESTARTS   AGE
whoami-4zx6t   1/1     Running   0          17s
whoami-cg7rw   1/1     Running   0          17s
whoami-n6xz9   1/1     Running   0          17s
```
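
To check that the pods actually answer, you can forward a local port to one of
them (pod name taken from the output above) and curl it:

```sh
# in one terminal: forward a local port to one of the whoami pods
kubectl port-forward whoami-4zx6t 8080:80

# in a second terminal: whoami responds with its hostname and the request details
curl http://localhost:8080
```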
---
This concludes the first part of this series.