Been a while since I last posted. For basically everything I do Linux-wise, I use Ubuntu, specifically the Long Term Support versions, which are released in April of even-numbered years (April 2020, April 2022, etc.), hence the 22.04 designation of the latest LTS, Jammy. I had standardized on 20.04 (Focal) for everything, but I'm finding that various packages I expect to be available on 20.04 aren't, while they are present in the standard, default package lists for Jammy 22.04.
Last night I was creating a new VM to transfer this very blog to my shiny, new co-located server. I was going to use the Ubuntu 20.04 Proxmox Cloud Init Image Script template I wrote up a bit ago. Then I stopped and realized I should update it. So I changed 2004 to 2204 and 'focal' to 'jammy' everywhere in the script, reran it, and now I have a fancy new cloud-init script for Ubuntu 22.04. Note that this script does require you to install libguestfs-tools, as indicated in the previous cloud-init post.
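If you want to make the same swap yourself, it's a one-liner with sed (the input filename here matches my crontab entry further down; the output name is simply what I'd call the new script):

sed 's/2004/2204/g; s/focal/jammy/g' ubuntu-2004-cloud-init-create.sh > ubuntu-2204-cloud-init-create.sh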
ChatGPT-enhanced script, as of 2023:
#!/bin/bash
##############################################################################
# things to double-check:
# 1. user directory
# 2. your SSH key location
# 3. which bridge you assign with the create line (currently set to vmbr100)
# 4. which storage is being utilized (script uses local-zfs)
##############################################################################
DISK_IMAGE="jammy-server-cloudimg-amd64.img"
IMAGE_URL="https://cloud-images.ubuntu.com/jammy/current/$DISK_IMAGE"
# Function to check if a file was modified in the last 24 hours or doesn't exist
should_download_image() {
    local file="$1"

    # If the file doesn't exist, return true (i.e., should download)
    [[ ! -f "$file" ]] && return 0

    local current_time=$(date +%s)
    local file_mod_time=$(stat --format="%Y" "$file")
    local difference=$(( current_time - file_mod_time ))

    # If older than 24 hours, return true
    (( difference >= 86400 ))
}
# Download the disk image if it doesn't exist or if it was modified more than 24 hours ago
if should_download_image "$DISK_IMAGE"; then
    rm -f "$DISK_IMAGE"
    wget -q "$IMAGE_URL"
fi
sudo virt-customize -a "$DISK_IMAGE" --install qemu-guest-agent
sudo virt-customize -a "$DISK_IMAGE" --ssh-inject root:file:/home/austin/id_rsa.pub
# Destroy the existing template VM (ID 9022) if it exists, so it can be recreated
if sudo qm list | grep -qw "9022"; then
    sudo qm destroy 9022
fi
sudo qm create 9022 --name "ubuntu-2204-cloudinit-template" --memory 2048 --cores 2 --net0 virtio,bridge=vmbr100
sudo qm importdisk 9022 "$DISK_IMAGE" local-zfs
sudo qm set 9022 --scsihw virtio-scsi-pci --scsi0 local-zfs:vm-9022-disk-0
sudo qm set 9022 --boot c --bootdisk scsi0
sudo qm set 9022 --ide2 local-zfs:cloudinit
sudo qm set 9022 --serial0 socket --vga serial0
sudo qm set 9022 --agent enabled=1
sudo qm template 9022
echo "Next up, clone VM, then expand the disk"
echo "You also still need to copy ssh keys to the newly cloned VM"
Original, non-ChatGPT script:
###############
# things to double-check:
# 1. user directory
# 2. your SSH key location
# 3. which bridge you assign with the create line (currently set to vmbr100)
# 4. which storage is being utilized (script uses local-zfs)
###############
rm -f jammy-server-cloudimg-amd64.img
wget -q https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img
sudo virt-customize -a jammy-server-cloudimg-amd64.img --install qemu-guest-agent
sudo virt-customize -a jammy-server-cloudimg-amd64.img --ssh-inject root:file:/home/austin/id_rsa.pub
sudo qm destroy 9022
sudo qm create 9022 --name "ubuntu-2204-cloudinit-template" --memory 2048 --cores 2 --net0 virtio,bridge=vmbr100
sudo qm importdisk 9022 jammy-server-cloudimg-amd64.img local-zfs
sudo qm set 9022 --scsihw virtio-scsi-pci --scsi0 local-zfs:vm-9022-disk-0
sudo qm set 9022 --boot c --bootdisk scsi0
sudo qm set 9022 --ide2 local-zfs:cloudinit
sudo qm set 9022 --serial0 socket --vga serial0
sudo qm set 9022 --agent enabled=1
sudo qm template 9022
rm -f jammy-server-cloudimg-amd64.img
echo "next up, clone VM, then expand the disk"
echo "you also still need to copy ssh keys to the newly cloned VM"
I suppose for full automation I should get the user with a whoami command, assign it to a variable, and use that throughout the script. That's a task for another day.
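If I ever get to it, it would only take a couple of lines near the top of the script, something like this (a sketch; RUN_USER is just a name I picked, and only the SSH key path actually depends on the user):

# determine the invoking user instead of hardcoding 'austin'
RUN_USER="$(whoami)"
sudo virt-customize -a "$DISK_IMAGE" --ssh-inject "root:file:/home/${RUN_USER}/id_rsa.pub"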
I also added a line in my crontab to have this script run weekly, so the image is essentially always up to date. I’ll run both 20.04 and 22.04 in parallel for a bit but anticipate that I can turn off the 20.04 job fairly soon.
# m h dom mon dow command
52 19 * * TUE sleep $(( RANDOM \% 60 )); /usr/bin/bash /home/austin/ubuntu-2004-cloud-init-create.sh >> /home/austin/ubuntu-template.log 2>&1
52 20 * * TUE sleep $(( RANDOM \% 60 )); /usr/bin/bash /home/austin/ubuntu-2204-cloud-init-create.sh >> /home/austin/ubuntu-2204-template.log 2>&1
This post has been a long time coming. I apologize for how long it's taken. I noticed that many other blogs left off at a similar position as I did: get the VMs created, then… nothing. Creating a Kubernetes cluster locally is a much cheaper (read: basically free) way to learn how Kubes works compared to a cloud-hosted solution or a full-blown managed Kubernetes service, such as AWS Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS), or Google Kubernetes Engine (GKE).
Anyways, I finally had some time to complete the tutorial series, so here we are. Since the last post, my wife and I are now expecting our 2nd kid, I put up a new solar panel array, built our 1st kid a new bed, messed around with macOS Monterey on Proxmox, built garden boxes, and a bunch of other stuff. Life happens. So without much more delay, let's jump back in.
Here’s a screenshot of the end state Kubernetes Dashboard showing our nodes:
Current State
If you’ve followed the blog series so far, you should have four VMs in your Proxmox cluster ready to go with SSH keys set, the hard drive expanded, and the right amount of vCPUs and memory allocated. If you don’t have those ready to go, take a step back (Deploying Kubernetes VMs in Proxmox with Terraform) and get caught up. We’re not going to use the storage VM. Some guides I followed had one but I haven’t found a need for it yet so we’ll skip it.
Ansible
What is Ansible
If you ask DuckDuckGo to define ansible, it will tell you the following: “A hypothetical device that enables users to communicate instantaneously across great distances; that is, a faster-than-light communication device.”
That's the science-fiction definition; the Ansible we care about is the IT automation tool. We will be using Ansible to run the initial Kubernetes setup steps on every machine, initialize the cluster on the master, and join the cluster on the workers/agents.
Initial Ansible Housekeeping
First we need to specify some variables similar to how we did it with Terraform. Create a file in your working directory called ansible-vars.yml and put the following into it:
# specifying a CIDR for our cluster to use.
# can be basically any private range except for ranges already in use.
# apparently it isn't too hard to run out of IPs in a /24, so we're using a /22
pod_cidr: "10.16.0.0/22"
# this defines what the join command filename will be
join_command_location: "join_command.out"
# setting the home directory for retrieving, saving, and executing files
home_dir: "/home/ubuntu"
Equally important (and potentially a better starting point than the variables) is defining the hosts. In ansible-hosts.txt:
# this is a basic file putting different hosts into categories
# used by ansible to determine which actions to run on which hosts
[all]
10.98.1.41
10.98.1.51
10.98.1.52
[kube_server]
10.98.1.41
[kube_agents]
10.98.1.51
10.98.1.52
[kube_storage]
#10.98.1.61
Checking that Ansible can communicate with our hosts
Let's pause here and make sure Ansible can communicate with our VMs. We will use a simple built-in module named 'ping' to do so. The command below, broken down:
-i ansible-hosts.txt – use the ansible-hosts.txt file
all – run the command against the [all] block from the ansible-hosts.txt file
-u ubuntu – log in with user ubuntu (since that's what we set up with the Ubuntu 20.04 Cloud Init template). If you don't use -u [user], Ansible will attempt to SSH as your currently logged-in user.
-m ping – run the ping module
ansible -i ansible-hosts.txt all -u ubuntu -m ping
If all goes well, you will receive “ping”: “pong” for each of the VMs you have listed in the [all] block of the ansible-hosts.txt file.
Potential SSH errors
If you've previously SSH'd to these IPs and have subsequently destroyed/re-created the VMs, you will get scary-sounding SSH errors about remote host identification having changed. Run the suggested ssh-keygen -f command for each of the IPs to fix it.
You might also have to SSH into each of the hosts once to accept the host key. I've done this whole procedure a couple of times, so I don't recall what pops up on the first attempt.
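For example, to clear the stale key for the master (assuming the default known_hosts location):

ssh-keygen -f "$HOME/.ssh/known_hosts" -R "10.98.1.41"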
Then we need a playbook to install the dependencies and the Kubernetes utilities themselves. This playbook does quite a few things: it gets apt ready to install packages, adds the Docker and Kubernetes signing keys, installs Docker and Kubernetes, disables swap, and adds the ubuntu user to the docker group.
ansible-install-kubernetes-dependencies.yml:
# https://kubernetes.io/blog/2019/03/15/kubernetes-setup-using-ansible-and-vagrant/
# https://github.com/virtualelephant/vsphere-kubernetes/blob/master/ansible/cilium-install.yml#L57
# ansible .yml files define what tasks/operations to run
---
- hosts: all # run on the "all" hosts category from ansible-hosts.txt
  # become means become superuser (sudo)
  become: true
  remote_user: ubuntu
  tasks:
    - name: Install packages that allow apt to be used over HTTPS
      apt:
        name: "{{ packages }}"
        state: present
        update_cache: yes
      vars:
        packages:
          - apt-transport-https
          - ca-certificates
          - curl
          - gnupg-agent
          - software-properties-common

    - name: Add an apt signing key for Docker
      apt_key:
        url: https://download.docker.com/linux/ubuntu/gpg
        state: present

    - name: Add apt repository for stable version
      apt_repository:
        repo: deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial stable
        state: present

    - name: Install docker and its dependencies
      apt:
        name: "{{ packages }}"
        state: present
        update_cache: yes
      vars:
        packages:
          - docker-ce
          - docker-ce-cli
          - containerd.io

    - name: Verify docker is installed, enabled, and started
      service:
        name: docker
        state: started
        enabled: yes

    - name: Remove swapfile from /etc/fstab
      mount:
        name: "{{ item }}"
        fstype: swap
        state: absent
      with_items:
        - swap
        - none

    - name: Disable swap
      command: swapoff -a
      when: ansible_swaptotal_mb > 0

    - name: Add an apt signing key for Kubernetes
      apt_key:
        url: https://packages.cloud.google.com/apt/doc/apt-key.gpg
        state: present

    - name: Adding apt repository for Kubernetes
      apt_repository:
        repo: deb https://apt.kubernetes.io/ kubernetes-xenial main
        state: present
        filename: kubernetes.list

    - name: Install Kubernetes binaries
      apt:
        name: "{{ packages }}"
        state: present
        update_cache: yes
      vars:
        packages:
          # it is usually recommended to specify which version you want to install
          - kubelet=1.23.6-00
          - kubeadm=1.23.6-00
          - kubectl=1.23.6-00

    - name: Hold kubernetes binary versions (prevent them from being updated)
      dpkg_selections:
        name: "{{ item }}"
        selection: hold
      loop:
        - kubelet
        - kubeadm
        - kubectl

    # this has to do with nodes having different internal/external/mgmt IPs
    # {{ node_ip }} comes from vagrant, which I'm not using yet
    # - name: Configure node ip
    #   lineinfile:
    #     path: /etc/default/kubelet
    #     line: KUBELET_EXTRA_ARGS=--node-ip={{ node_ip }}

    - name: Restart kubelet
      service:
        name: kubelet
        daemon_reload: yes
        state: restarted

    - name: Add ubuntu user to the docker group
      user:
        name: ubuntu
        groups: docker
        append: yes

    - name: Reboot to apply swap disable
      reboot:
        reboot_timeout: 180 # allow 3 minutes for the reboot to happen
With our fresh VMs straight outta Terraform, let’s now run the Ansible script to install the dependencies.
Ansible command to run the Kubernetes dependency playbook (pretty straightforward: -i supplies the hosts file, and the next argument is the playbook file itself):
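ansible-playbook -i ansible-hosts.txt ansible-install-kubernetes-dependencies.yml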
It’ll take a bit of time to run (1m26s in my case). If all goes well, you will be presented with a summary screen (called PLAY RECAP) showing some items in green with status ok and some items in orange with status changed. I got 13 ok’s, 10 changed’s, and 1 skipped.
Initialize the Kubernetes cluster on the master
With the dependencies installed, we can now proceed to initialize the Kubernetes cluster itself on the server/master machine. This playbook sets docker to use the systemd cgroups driver (the alternative is cgroupfs; systemd was the easiest option to get working here), initializes the cluster, copies the cluster config files to the ubuntu user's home directory, and installs the Calico networking plugin and the standard Kubernetes dashboard.
ansible-init-cluster.yml:
- hosts: kube_server
  become: true
  remote_user: ubuntu
  vars_files:
    - ansible-vars.yml
  tasks:
    - name: set docker to use systemd cgroups driver
      copy:
        dest: "/etc/docker/daemon.json"
        content: |
          {
            "exec-opts": ["native.cgroupdriver=systemd"]
          }

    - name: restart docker
      service:
        name: docker
        state: restarted

    - name: Initialize Kubernetes cluster
      command: "kubeadm init --pod-network-cidr {{ pod_cidr }}"
      args:
        creates: /etc/kubernetes/admin.conf # skip this task if the file already exists
      register: kube_init

    - name: show kube init info
      debug:
        var: kube_init

    - name: Create .kube directory in user home
      file:
        path: "{{ home_dir }}/.kube"
        state: directory
        owner: 1000
        group: 1000

    - name: Configure .kube/config file in user home
      copy:
        src: /etc/kubernetes/admin.conf
        dest: "{{ home_dir }}/.kube/config"
        remote_src: yes
        owner: 1000
        group: 1000

    - name: restart kubelet for config changes
      service:
        name: kubelet
        state: restarted

    - name: get calico networking manifest
      get_url:
        url: https://projectcalico.docs.tigera.io/manifests/calico.yaml
        dest: "{{ home_dir }}/calico.yaml"

    - name: apply calico networking
      become: no
      command: kubectl apply -f "{{ home_dir }}/calico.yaml"

    - name: get dashboard manifest
      get_url:
        url: https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.0/aio/deploy/recommended.yaml
        dest: "{{ home_dir }}/dashboard.yaml"

    - name: apply dashboard
      become: no
      command: kubectl apply -f "{{ home_dir }}/dashboard.yaml"
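Run it the same way as the dependency playbook:

ansible-playbook -i ansible-hosts.txt ansible-init-cluster.yml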
Initializing the cluster took 53s on my machine. One of the first tasks is to download the images which takes the majority of the duration. You should get 13 ok and 10 changed with the init. I had two extra user check tasks because I was fighting some issues with applying the Calico networking.
With the master up and running, we need to retrieve the join command. I chose to save the command locally and read the file in a subsequent Ansible playbook. This could certainly be combined into a single playbook.
ansible-get-join-command.yaml:
- hosts: kube_server
  become: false
  remote_user: ubuntu
  vars_files:
    - ansible-vars.yml
  tasks:
    - name: Extract the join command
      become: true
      command: "kubeadm token create --print-join-command"
      register: join_command

    - name: show join command
      debug:
        var: join_command

    - name: Save kubeadm join command for cluster
      # defaults to your local cwd/join_command.out
      local_action: copy content={{ join_command.stdout_lines | last | trim }} dest={{ join_command_location }}
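To actually join the workers/agents, you can read that file back in a follow-up playbook, or just fire the saved command at the agents with an ad-hoc Ansible run. A sketch of the ad-hoc route, assuming join_command.out is sitting in your current directory:

# run the saved kubeadm join command on every host in [kube_agents]
# -b escalates to root, which kubeadm join requires
ansible -i ansible-hosts.txt kube_agents -u ubuntu -b -m command -a "$(cat join_command.out)"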
With the two worker nodes/agents joined up to the cluster, you now have a full-on Kubernetes cluster up and running! Wait a few minutes, then log into the server and run kubectl get nodes to verify they are present and active (STATUS = Ready):
kubectl get nodes
Kubernetes Dashboard
Everyone likes a dashboard. Kubernetes has a good one for poking/prodding around. It appears to basically be a visual representation of most (all?) of the "get information" types of commands you can run with kubectl (kubectl get nodes, get pods, describe stuff, etc.).
The dashboard was installed with the cluster init script but we still need to create a service account and cluster role binding for the dashboard. These steps are from https://github.com/kubernetes/dashboard/blob/master/docs/user/access-control/creating-sample-user.md. NOTE: the docs state it is not recommended to give admin privileges to this service account. I’m still figuring out Kubernetes privileges so I’m going to proceed anyways.
Dashboard user/role creation
On the master machine, create a file called sa.yaml with the following contents:
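The linked doc creates an admin-user service account in the kubernetes-dashboard namespace and binds it to the cluster-admin role. Here it is wrapped in a heredoc so you can paste it straight into a shell on the master (contents taken from that doc):

cat <<'EOF' > sa.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: admin-user
    namespace: kubernetes-dashboard
EOF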
Apply both (they live in the same file), then get the token to be used for logging in. The last command will spit out a long string. Copy it starting at 'ey' and ending before the username (ubuntu). In the screenshot I have highlighted which part is the token.
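A sketch of those two steps (the describe-secret form matches the dashboard v2.5-era docs; newer Kubernetes versions changed how service account tokens work):

kubectl apply -f sa.yaml
# print the admin-user token; copy the long string starting with 'ey'
kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}')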
At this point, the dashboard has been running for a while. We just can’t get to it yet. There are two distinct steps that need to happen. The first is to create a SSH tunnel between your local machine and a machine in the cluster (we will be using the master). Then, from within that SSH session, we will run kubectl proxy to expose the web services.
SSH command – the master’s IP is 10.98.1.41 in this example:
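# forward local port 8001 to port 8001 on the master (kubectl proxy's default port)
ssh -L 8001:localhost:8001 ubuntu@10.98.1.41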
The above command will open what appears to be a standard SSH session but the tunnel is running as well. Now execute kubectl proxy:
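kubectl proxy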
The Kubernetes Dashboard
At this point, you should be able to navigate to the dashboard page from a web browser on your local machine (http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/) and you’ll be prompted for a log in. Make sure the token radio button is selected and paste in that long token from earlier. It expires relatively quickly (couple hours I think) so be ready to run the token retrieval command again.
The default view is for the “default” namespace which has nothing in it at this point. Change it to All namespaces for more details:
From here you can see information about everything in the cluster:
Building off my last NTP post (Microsecond accurate NTP with a Raspberry Pi and PPS GPS), which required a $50-60 GPS device and a Raspberry Pi (also $40+), I have successfully tested something much cheaper that is good enough, especially for initial PPS synchronization. Good enough, in this case, is defined as +/- 10 milliseconds, which can easily be achieved using a basic USB GPS device: GT-U7. Read on for instructions on how to set up the USB GPS as a Stratum 1 NTP time server.
How accurate does your time really need to be? The last post showed how to get all devices on a local area network (LAN) within 0.1 milliseconds of "real" time. Do you need your equipment to be that accurate to official atomic clock time (12:03:05.0001)? Didn't think so. Do you care if every device is on the correct second compared to official/accurate time (12:03:05)? That's a lot more reasonable. Using a u-blox USB GPS can get you within 0.01 seconds of official time. The best part about this? The required USB GPS units are almost always less than $15, and you don't need a Raspberry Pi.
Overview
This post will show how to add a u-blox USB GPS module to NTPd as a driver or to chrony (a timekeeping daemon) as a reference clock, using GPSd shared memory for both, and verify the accuracy is within +/- 10 milliseconds.
Materials needed
USB u-blox GPS (VK-172 or GT-U7); GT-U7 preferred because it has a micro-USB plug to connect to your computer. It is important to note that both of these are u-blox modules, which have a binary data format as well as a high default baud rate (57600). These two properties allow for quick transmission of each GPS message from the GPS to the computer.
15-30 minutes
Steps
1 – Update your host machine and install packages
This tutorial is for Linux. I use Ubuntu, so we'll use apt for package management:
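The packages needed are gpsd, its client utilities, and chrony (a sketch; swap in ntp if you'd rather follow the NTPd path in step 3b):

sudo apt update
sudo apt install -y gpsd gpsd-clients chrony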
2 – Configure gpsd
In /etc/default/gpsd, change the settings to the following:
# Start the gpsd daemon automatically at boot time
START_DAEMON="true"
# Use USB hotplugging to add new USB devices automatically to the daemon
USBAUTO="true"
# Devices gpsd should collect to at boot time.
# this could also be /dev/ttyUSB0, it is ACM0 on raspberry pi
DEVICES="/dev/ttyACM0"
# -n means start listening to GPS data without a specific listener
GPSD_OPTIONS="-n"
Reboot with sudo reboot.
3a – Chrony configuration (if using NTPd, go to 3b)
I took the default configuration, added my 10.98 servers, and more importantly, added a reference clock (refclock). Link to chrony documentation here. Arguments/parameters of this configuration file:
iburst means send a bunch of synchronization packets upon service start so accurate time can be determined much faster (usually a couple seconds)
maxpoll (and minpoll, which isn’t used in this config) is how many seconds to wait between polls, defined by 2^x where x is the number in the config. maxpoll 6 means don’t wait more than 2^6=64 seconds between polls
refclock is reference clock, and is the USB GPS source we are adding
‘SHM 0’ means shared memory reference 0, which means it is checking with GPSd using shared memory to see what time the GPS is reporting
‘refid NMEA’ means name this reference ‘NMEA’
‘offset 0.000’ means don’t offset this clock source at all. We will change this later
‘precision 1e-3’ means this reference is only accurate to 1e-3 (0.001) seconds, or 1 millisecond
‘poll 3’ means poll this reference every 2^3 = 8 seconds
‘noselect’ means don’t actually use this clock as a source. We will be measuring the delta to other known times to set the offset and make the source selectable.
pi@raspberrypi:~ $ sudo cat /etc/chrony/chrony.conf
# Welcome to the chrony configuration file. See chrony.conf(5) for more
# information about usable directives.
#pool 2.debian.pool.ntp.org iburst
server 10.98.1.1 iburst maxpoll 6
server 10.98.1.198 iburst maxpoll 6
server 10.98.1.15 iburst
refclock SHM 0 refid NMEA offset 0.000 precision 1e-3 poll 3 noselect
Restart chrony with sudo systemctl restart chrony.
3b – NTP config
Similar to the chrony config, we need to add a reference clock (called a driver in NTP). For NTP, drivers are "servers" that start with an address of 127.127. The next two octets tell what kind of driver it is. The .28 driver is the shared memory driver, same theory as for chrony. For a full list of drivers, see the official NTP docs. To break down the server config (full lines shown after this list):
‘server 127.127.28.0’ means use the .28 (SHM) driver
minpoll 4 maxpoll 4 means poll every 2^4=16 seconds
noselect means don’t use this for time. Similar to chrony, we will be measuring the offset to determine this value.
‘fudge 127.127.28.0’ means we are going to change some properties of the listed driver
'time1 0.000' is the time offset calibration factor, in seconds
‘stratum 2’ means list this source as a stratum 2 source (has to do with how close the source is to “true” time), listing it as 2 means other, higher stratum sources will be selected before this one will (assuming equal time quality)
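Putting those together, the relevant /etc/ntp.conf lines look like this (a sketch matching the breakdown above):

server 127.127.28.0 minpoll 4 maxpoll 4 noselect
fudge 127.127.28.0 time1 0.000 stratum 2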
4 – Check the GPS time offset with gpsmon
Running gpsmon shows us general information about the GPS, including time offset. The output looks like the below screenshot. Of importance are the satellite count (on the right; more is better, >5 is good enough for time), HDOP (horizontal dilution of precision, a measure of how well the satellites can determine your position; lower is better, <2 works for basically all navigation purposes), and TOFF (time offset).
In this screenshot the TOFF is 0.081862027, which is 81.8 milliseconds off the host computer's time. Watch this for a bit; it should hover pretty close to a certain value +/- 10 ms. In my case, I've noticed that if there are 10 or fewer satellites locked on, it is around 77 ms. If there are 11 or more, it is around 91 ms (presumably due to more satellite information that needs to be transmitted).
5 – record statistics for a data-driven offset
If you are looking for a better offset value to put in the configuration file, we can turn on logging from either chrony or NTPd to record source information.
For chrony:
Edit /etc/chrony/chrony.conf and uncomment the line for which kinds of logging to turn on:
# Uncomment the following line to turn logging on.
log tracking measurements statistics
Then restart chrony (sudo systemctl restart chrony) and logs will start writing to /var/log/chrony (this location is defined a couple lines below the log line in chrony.conf):
pi@raspberrypi:~ $ ls /var/log/chrony
measurements.log statistics.log tracking.log
For NTPd (be sure to restart it after making any configuration changes):
austin@prox-3070 ~ % cat /etc/ntp.conf
# Enable this if you want statistics to be logged.
statsdir /var/log/ntpstats/
statistics loopstats peerstats clockstats
filegen loopstats file loopstats type day enable
filegen peerstats file peerstats type day enable
filegen clockstats file clockstats type day enable
Wait a few minutes for some data to record (chrony synchronizes pretty quick compared to NTPd) and check the statistics file, filtered to our NMEA refid:
cat /var/log/chrony/statistics.log | grep NMEA
This spits out the lines that have NMEA present (the ones of interest for our USB GPS). To include the headers showing what each column is, we can run:
# chrony
cat /var/log/chrony/statistics.log | head -2; cat /var/log/chrony/statistics.log | grep NMEA
# ntp, there is no header info so we can omit that part of the command
cat /var/log/peerstats | grep 127.127.28.0
NTP stats don’t include header information. The column of interest is the one after the 9014 column. The columns are day, seconds past midnight, source, something, estimated offset, something, something, something. We can see the offset for this VK-172 USB GPS is somewhere around 76-77 milliseconds (0.076-0.077 seconds), which we can put in place of the 0.000 for the .28 driver for NTP and remove noselect.
So now we have some data showing the statistics of our NMEA USB GPS NTP source. We can copy and paste this into Excel, run Text to Columns, and graph the result and/or take the average to set the offset.
This graph is certainly suspicious (sine wave pattern and such), and if I wasn't writing this blog post, I'd let data collect overnight to determine an offset. Since time is always of the essence, I will just take the average of the 'est offset' column (E), which is 7.64E-2, or 0.0764 seconds. Let's pop this into the chrony.conf file and remove noselect:
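refclock SHM 0 refid NMEA offset 0.0764 precision 1e-3 poll 3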
Restart chrony again for the config file changes to take effect – sudo systemctl restart chrony.
6 – watch ‘chrony sources’ or ‘ntpq -pn’ to see if the USB GPS gets selected as the main time source
If you aren't aware, Ubuntu/Debian/most Linux distributions include a utility called watch, which reruns a command every x seconds. We can use this to see how chrony is interpreting each time source, refreshed every 1 second:
# for chrony
watch -n 1 chronyc sources
In the above screenshot, we can see that chrony actually has the NMEA source selected as the primary source (denoted with the *). It has the Raspberry Pi PPS NTP GPS ready to take over as the new primary (denoted with the +). All of the sources match quite closely (from +4749us to -505us is a spread of around 5.2 milliseconds). The source "offset" is in the square brackets ([ and ]).
# for ntp
watch -n 1 ntpq -pn
7 – Is +/- five milliseconds good enough?
For 99% of use cases, yes. You can stop here and your home network will be plenty accurate. If you want additional accuracy, you are in luck. This GPS module also outputs a PPS (pulse per second) signal! We can use this to get within 0.05 milliseconds (0.00005 seconds) of official/atomic clock time.
Conclusion
In this post, we got a u-blox USB GPS set up, added it as a reference clock (refclock) to chrony, and demonstrated that it is clearly within 10 milliseconds of official GPS time.
You could write a script to do all this for you! I should probably try this myself…
In the next post, we can add PPS signals from the GPS module to increase our time accuracy by 1000x (into the microsecond range).
A note on why having faster message transmission is better for timing
My current PPS NTP server uses chrony with NMEA messages transmitted over serial and the PPS signal fed into a GPIO pin. GPSd as a rule does minimal configuration of GPS devices. It typically defaults to 9600 baud for serial devices. A typical GPS message looks like this:
$GPGGA, 161229.487, 3723.2475, N, 12158.3416, W, 1, 07, 1.0, 9.0, M, , , , 0000*18
That message is 83 bytes long. At 9600 baud with standard 8N1 serial framing (10 bits on the wire per byte, so roughly 1.04 milliseconds per byte), that message takes about 86.5 milliseconds to transmit. That means that as the message length varies, the jitter will increase. GPS messages do vary in length, sometimes significantly, depending on what is being sent (e.g. the satellite information, $GPGSV sentences, is only transmitted every 5-10 seconds).
I opened gpsmon to get a sample of sentences – I did not notice this until now but it shows how many bytes each sentence is at the front of the sentence:
These sentences range from 83 bytes to 35 bytes, a variation of (83 bytes - 35 bytes) * 1.04 milliseconds per byte = roughly 50 milliseconds.
Compare that to the u-blox binary UBX messages, which seem to always be 60 bytes and are transmitted at 57600 baud, taking about 10.4 milliseconds per message.
The variance (jitter) is thus much lower, and the module can be much more accurate as an NTP source. GPSd has no problem leaving u-blox modules at 57600 baud. This is why the USB GPS modules perform much more accurately for timekeeping than NMEA-over-slow-serial devices when using GPSd.
For basically every GPS module/chipset, it is possible to send it commands to enable/disable sentences (as well as increase the serial baud rate). In an ideal world for timekeeping, GPSd would disable every sentence except for time ($GPZDA), and bump up the baud rate to the highest supported level (115200, 230400, etc.). Unfortunately for us, GPSd’s default behavior is to just work with every GPS, which essentially means no configuring the GPS device.
Without further ado, below is the template I used to create my virtual machines. The main LAN network is 10.98.1.0/24, and the Kube internal network (on its own bridge) is 10.17.0.0/24.
This template creates a Kube server, two agents, and a storage server.
Update 2022-04-26: bumped Telmate provider version to 2.9.8 from 2.7.4
This WordPress blog is decently protected from bots/hackers (read more at Securing this WordPress blog from evil hackers!) but I still get a ton of attempts on the site. Wordfence can block requests at the application layer but as I grow in traffic, I want to make sure CPU cycles aren’t wasted. Thus, I want to block some of these bots/hackers from even connecting to the server. I use Ubuntu, and UFW (Uncomplicated FireWall) is included. It’s pretty simple so I’ve stuck with it.
Blocking IPv4 is easy:
sudo ufw insert 1 deny from 1.2.3.4 comment "repeated unwanted hits on sdrforums.com"
The command broken down:
sudo – run as root since firewall modification requires root access
ufw – run the uncomplicated firewall program
insert – add a rule
1 – insert at the top of the rule list (firewalls evaluate rules from the top down – putting a deny after an allow would mean the traffic wouldn’t be blocked)
deny – deny the request
from – from the following IP
1.2.3.4 – IP address
comment – so you can leave a comment to remind yourself why the rule is in place (“unwanted hits on site.com”, “change request #123456”, “incident remedy #44444”, etc.)
Blocking IPv6 with UFW
So I tried the same command format with an IPv6 address and got an error message: "ERROR: Invalid position '1'". I've never gotten that message before. Also, I do realize I need to widen this IPv6 subnet and block a much larger range of IPs, but that's a topic for a different day.
sudo ufw insert 1 deny from 2400:adc5:11f:5600:e468:9464:e881:b1c0 comment "repeated unwanted hits on sdrforums.com"
ERROR: Invalid position '1'
My rule list at the time looked like this:
austin@rn-nyc-01:~$ sudo ufw status numbered
Status: active
     To                         Action      From
     --                         ------      ----
[ 1] Anywhere                   DENY IN     51.195.216.255             # repeated unwanted hits on sdrforums.com
[ 2] Anywhere                   DENY IN     51.195.90.229              # repeated unwanted hits on sdrforums.com
[ 3] Anywhere                   DENY IN     206.217.139.28             # excessive hits to wp-login and xmlrpc
[ 4] 22/tcp                     ALLOW IN    Anywhere
[ 5] 80/tcp                     ALLOW IN    Anywhere
[ 6] 443/tcp                    ALLOW IN    Anywhere
[ 7] 22/tcp (v6)                ALLOW IN    Anywhere (v6)
[ 8] 80/tcp (v6)                ALLOW IN    Anywhere (v6)
[ 9] 443/tcp (v6)               ALLOW IN    Anywhere (v6)
Pretty easy to understand what's going on here. I have a few IPv4 addresses blocked, but nothing specific to IPv6. A bit of searching later, I learned that the first IPv6 rule needs to come after the last IPv4 rule. So in this case I needed to add the rule at position #7, since that is where the first IPv6 rule is currently located:
austin@rn-nyc-dotnet-01:~$ sudo ufw insert 7 deny from 2400:adc5:11f:5600:e468:9464:e881:b1c0 comment "repeated unwanted hits on sdrforums.com"
Rule inserted (v6)