Building a Kubernetes Cluster in VirtualBox with Ubuntu
Lately I’ve been trying out various ways to set up a Kubernetes cluster. So far the easiest has been just to go to Google Cloud’s container engine (GKE) and provision myself a new cluster. No fuss, no muss (except that if you enable RBAC, you have to remember to add a role binding so you can see the dashboard). I haven’t (yet) tried Azure. AWS works when you have full control of the account, but setting up a cluster within the constraints of enterprise checks and balances is difficult, and you often can’t use the cloud provider integration for auto-provisioning.
So next I tried using lxd on my Ubuntu tower (64GB RAM, 1TB disk, Ubuntu 17.04). This failed not once, not twice, but well over a dozen times. Once it failed so badly that I had to completely re-image my OS install. The best I ever managed with any variation of lxd + conjure-up was a cluster where half the pods were caught in a fail/crash loop.
Sick and tired of bad assumptions being made on my behalf, I fired up VirtualBox and created a new VM with 2 CPUs and 2GB RAM (2 CPUs for the master is important!). Before I turned on the machine, I created a host-only network called vboxnet0 with an IP address of 192.168.99.1. My VirtualBox networking abilities are a bit rusty, so I followed some instructions I found online to set up the host-only network.
On this new machine (called kubemaster) I installed Ubuntu 16.04.3 LTS. You can download the .iso directly from the source, mount it in VirtualBox, and go through the installation. I didn’t choose any fancy options and kept the defaults it offered during install.
On this machine I also disabled swap (you may not need to do this for your setup). To disable swap immediately you can use swapoff, but you’ll also need to edit your /etc/fstab file to remove or comment out any swap entries. This is required by kubeadm during its “pre-flight checks”.
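The swap steps above, as a sketch (the sed pattern is my own shorthand — eyeball your fstab before trusting it):

```shell
# Disable swap for the current boot (kubeadm's pre-flight checks require this)
sudo swapoff -a

# Comment out any swap entries so the change survives a reboot;
# -i.bak keeps a backup copy of the original file
sudo sed -i.bak '/\sswap\s/ s/^/#/' /etc/fstab
```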
Next I installed Docker and cloned the VM. I did not do a linked clone; I had issues with the system’s product_uuid not changing, and a unique product_uuid is also required by kubeadm. I cloned the machine until I had 3 freshly configured Ubuntu boxes. The kubemaster box has 2 CPUs and 2GB RAM; the workers worker1 and worker2 have 1 CPU and 1GB RAM each. During some of my initial failures, kube-dns would crash in a loop, and I eventually saw that there were insufficient CPU resources available to schedule it.
For each of these machines, I then:
- Set the hostname accordingly
- Configured the IP address of the host-only network as a static IP
- Rebooted and ensured I could ssh to any machine in the group from any machine
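One detail that made the ssh check work for me: unless you have DNS for these names, each VM needs the others listed in its /etc/hosts. A sketch, using the hostnames and addresses from my setup (adjust to yours):

```shell
# Append name/IP mappings to /etc/hosts on each VM so hostnames resolve
cat <<'EOF' | sudo tee -a /etc/hosts
192.168.99.20 kubemaster
192.168.99.21 worker1
192.168.99.22 worker2
EOF
```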
To configure the hostname I modified /etc/hostname, and to set the IP address of the host-only adapter, I added the following to the bottom of the file /etc/network/interfaces:
# Host-only network
auto enp0s8
iface enp0s8 inet static
address 192.168.99.20
netmask 255.255.255.0
network 192.168.99.0
broadcast 192.168.99.255
Your interface name may be different, so check with ip addr or ifconfig -a to see what your interface names are. Compare the MAC address you see in the OS with the MAC assigned in your VirtualBox host-only network configuration to make sure you’ve got the right one. In my case, enp0s8 is my host-only network and enp0s3 is my regular adapter, which can see my host machine’s network and is configured for DHCP.
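These commands help with that MAC-matching exercise (enp0s8 is my host-only adapter name; substitute your own):

```shell
# Show interfaces with state and MAC address; compare the MACs against
# the adapters in the VM's VirtualBox network settings
ip -brief link

# After editing /etc/network/interfaces, bring the adapter up and
# verify the static address took effect
sudo ifup enp0s8
ip addr show enp0s8
```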
I configured my IP addresses and hostnames as follows:
- master — kubemaster — 192.168.99.20
- worker 1 — worker1 — 192.168.99.21
- worker 2 — worker2 — 192.168.99.22
After these were all set, I took a snapshot of the master (because I’m accident-prone and always prefer having a backup) and then ssh’d into kubemaster from my workstation. My real terminal has better copy/paste support than VirtualBox’s quirky console. Because Ubuntu doesn’t have an active root account, you’ll need to sudo most of your kubeadm commands. Follow the Kubernetes documentation for installing kubeadm on Ubuntu.
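For reference, the install steps from the docs looked roughly like this when I did it (repository and key details change over time, so verify against the current kubeadm install page rather than trusting this snapshot):

```shell
# Add the Kubernetes apt repository and install the cluster tools
# (repository details as of this writing; check the official docs)
sudo apt-get update && sudo apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | \
  sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
```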
On the master, you need to run kubeadm init. Note that we absolutely have to specify the advertise address of the API server, because the host has multiple network adapters and we need the API server to advertise on the host-only network IP. Also, go look now at the apply commands for your CNI plugin of choice, as some of them require you to pass parameters to kubeadm, like the pod CIDR block:
sudo kubeadm init --apiserver-advertise-address=192.168.99.20 --pod-network-cidr=192.168.0.0/16
I knew ahead of time that I was going to run Calico, so I knew that I needed to set the pod network CIDR. My lesson learned here: pick your network plugin before you run kubeadm init. Otherwise, you’ll have to do what I did and go back to a pre-kube snapshot and re-install.
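If you don’t have a snapshot to roll back to, kubeadm also ships a lighter-weight escape hatch. It won’t clean up everything a CNI plugin leaves behind, but on a fresh lab box it’s usually enough to let you re-run init with different flags:

```shell
# Tear down most of what kubeadm init set up so init can be re-run
sudo kubeadm reset
```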
Everything we’ve done up to this point means that my nodes are going to have IP addresses in the range 192.168.99.20–22, and the pods will be allocated IP addresses out of the CIDR 192.168.0.0/16. Understanding the interplay between node IPs and pod IPs took me a while when first learning about Kubernetes.
After kubeadm finishes, it should tell you everything is good and then give you a kubeadm join command with a token. DO NOT LOSE THIS COMMAND. I stored it as a text file in my user’s home directory on the master node. Since we’re not in production and we’re only building a lab system, this is okay.
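One step that’s easy to skim past in the init output: kubeadm init also prints commands for copying the admin kubeconfig into your user’s home directory, and kubectl on the master won’t work until you run them:

```shell
# Make the admin kubeconfig available to your (non-root) user;
# kubeadm init prints these exact steps when it finishes
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```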
Now ssh into worker1 and install kubeadm. Run the join command you were given as a sudo command. This command will be surprisingly fast. Within just a few seconds, you should be able to run kubectl get nodes on the master and see that you now have a master and a worker.
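If you do lose the join command despite the warning above, you can recover the token from the master rather than re-initializing (exact flags vary by kubeadm version, so treat this as a sketch):

```shell
# List the bootstrap tokens that currently exist on the master
sudo kubeadm token list

# Newer kubeadm versions can regenerate the full join command directly
sudo kubeadm token create --print-join-command
```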
Repeat the process on worker2. At this point, you should notice that kube-dns is in a Pending state and hasn’t been scheduled (check this with kubectl get pods --all-namespaces). This is because we have no network plugin. How can the nodes discover and communicate with each other without one? Because the node network for the cluster is completely independent of the pod network, and the node network is already up and running via our kubeadm commands. I chose Calico for my network plugin:
kubectl apply -f https://docs.projectcalico.org/v2.6/getting-started/kubernetes/installation/hosted/kubeadm/1.6/calico.yaml
After a few seconds (this will feel like an eternity if you don’t know whether it’s going to work), you should start to see kube-dns being scheduled (it’ll be in the ContainerCreating state). A few more seconds and it’ll be running. Once it’s running, you should be able to see all your pods like this:
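If you’d rather watch that transition happen than re-run the get command over and over:

```shell
# Stream pod status changes across all namespaces until interrupted
kubectl get pods --all-namespaces -w
```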
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-etcd-c2m6m 1/1 Running 1 3d
kube-system calico-kube-controllers-685f8bc7fb-nbvbs 1/1 Running 1 3d
kube-system calico-node-gdrnz 2/2 Running 0 3d
kube-system calico-node-s4x8z 2/2 Running 1 3d
kube-system calico-node-v7b9h 2/2 Running 2 3d
kube-system etcd-kubemaster 1/1 Running 1 3d
kube-system kube-apiserver-kubemaster 1/1 Running 1 3d
kube-system kube-controller-manager-kubemaster 1/1 Running 1 3d
kube-system kube-dns-545bc4bfd4-cbxbw 3/3 Running 0 3d
kube-system kube-proxy-2nxvq 1/1 Running 0 3d
kube-system kube-proxy-8dmzc 1/1 Running 0 3d
kube-system kube-proxy-zz259 1/1 Running 1 3d
kube-system kube-scheduler-kubemaster 1/1 Running 1 3d
And you should have a nice set of VMs running in VirtualBox.
And that’s it! Now you’ve got a running cluster that you can use for lab experiments, learning, and even local development. Hopefully this post makes the experience smoother and less profanity-laced than my first few attempts!