Categories
homelab Kubernetes Linux proxmox Terraform

How to deploy VMs in Proxmox with Terraform

Background

I’d like to learn Kubernetes and DevOps. A Kubernetes cluster requires at least 3 VMs/bare metal machines. In my last post, I wrote about how to create a Ubuntu cloud-init template for Proxmox. In this post, we’ll take that template and use it to deploy a couple VMs via automation using Terraform. If you don’t have a template, you need one before proceeding.

Overview

  1. Install Terraform
  2. Determine authentication method for Terraform to interact with Proxmox (user/pass vs API keys)
  3. Terraform basic initialization and provider installation
  4. Develop Terraform plan
  5. Terraform plan
  6. Run Terraform plan and watch the VMs appear!

Youtube Video Link

If you prefer video versions to follow along, please head on over to https://youtu.be/UXXIl421W8g for a live-action video of me deploying virtual machines in Proxmox using Terraform and why we’re running each command.

#1 – Install Terraform

curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
sudo apt-add-repository "deb [arch=$(dpkg --print-architecture)] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
sudo apt update
sudo apt install terraform

#2 – Determine Authentication Method (use API keys)

You have two options here:

  1. Username/password – you can use the existing default root user and root password here to make things easy… or
  2. API keys – this involves setting up a new user, giving that new user the required permissions, and then setting up API keys so that user doesn’t have to type in a password to perform actions

I went with the API key method since it is not desirable to have your root password sitting in Terraform files (even as an environment variable isn’t a great idea). I didn’t really know what I was doing and I basically gave the new user full admin permissions anyways. Should I lock it down? Surely. Do I know what the minimum required permissions are to do so? Nope. If someone in the comments or on Reddit could enlighten me, I’d really appreciate it!

So we need to create a new user. We’ll name it ‘blog_example’. To add a new user go to Datacenter in the left tab, then Permissions -> Users -> Click add, name the user and click add.

screenshot showing how to add a user in proxmox
Adding ‘blog_example’ user to my proxmox datacenter (cluster)

Next, we need to add API tokens. Click API tokens below users in the permissions category and click add. Select the user you just created and give the token an ID, and uncheck privilege separation (which means we want the token to have the same permissions as the user):

Adding a new API token for user ‘blog_example’

When you click Add it will show you the key. Save this key. It will never be displayed again!

Super secret API key secret

Next we need to add a role to the new user. Permissions -> Add -> Path = ‘/’, User is the one you just made, role = ‘PVEVMAdmin’. This gives the user (and associated API token!) rights to all nodes (the / for path) to do VMAdmin activities:

You also need to add permissions to the storage used by the VMs you want to deploy (both from and to), for me this is /storage/local-zfs (might be /storage/local-lvm for you). Add that too in the path section. Use Admin for the role here because the user also needs the ability to allocate space in the datastore (you could use PVEVMAdmin + a datastore role but I haven’t dove into which one yet):

At this point we are done with the permissions:

It is time to turn to Terraform.

3 – Terraform basic information and provider installation

Terraform has three main stages: init, plan, and apply. We will start with describing the plans, which can be thought of a a type of configuration file for what you want to do. Plans are files stored in directories. Make a new directory (terraform-blog), and create two files: main.tf and vars.tf:

cd ~
mkdir terraform-blog && cd terraform-blog
touch main.tf vars.tf

The two files are hopefully reasonably named. The main content will be in main.tf and we will put a few variables in vars.tf. Everything could go in main.tf but it is a good practice to start splitting things out early. I actually don’t have as much in vars.tf as I should but we all gotta start somewhere

Ok so in main.tf let’s add the bare minimum. We need to tell Terraform to use a provider, which is the term they use for the connector to the entity Terraform will be interacting with. Since we are using Proxmox, we need to use a Proxmox provider. This is actually super easy – we just need to specify the name and version and Terraform goes out and grabs it from github and installs it. I used the Telmate Proxmox provider.

main.tf:

terraform {
  required_providers {
    proxmox = {
      source = "telmate/proxmox"
      version = "2.7.4"
    }
  }
}

Save the file. Now we’ll initialize Terraform with our barebones plan (terraform init), which will force it to go out and grab the provider. If all goes well, we will be informed that the provider was installed and that Terraform has been initialized. Terraform is also really nice in that it tells you the next step towards the bottom of the output (“try running ‘terraform plan’ next”).

austin@EARTH:/mnt/c/Users/Austin/terraform-blog$ terraform init

Initializing the backend...

Initializing provider plugins...
- Finding telmate/proxmox versions matching "2.7.4"...
- Installing telmate/proxmox v2.7.4...
- Installed telmate/proxmox v2.7.4 (self-signed, key ID A9EBBE091B35AFCE)

Partner and community providers are signed by their developers.
If you'd like to know more about provider signing, you can read about it here:
https://www.terraform.io/docs/cli/plugins/signing.html

Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

4 – Develop Terraform plan

Alright with the provider installed, it is time to use it to deploy a VM. We will use the template we created in the last post (How to create a Proxmox Ubuntu cloud-init image). Alter your main.tf file to be the following. I break it down inside the file with comments

terraform {
  required_providers {
    proxmox = {
      source = "telmate/proxmox"
      version = "2.7.4"
    }
  }
}

provider "proxmox" {
  # url is the hostname (FQDN if you have one) for the proxmox host you'd like to connect to to issue the commands. my proxmox host is 'prox-1u'. Add /api2/json at the end for the API
  pm_api_url = "https://prox-1u:8006/api2/json"

  # api token id is in the form of: <username>@pam!<tokenId>
  pm_api_token_id = "blog_example@pam!new_token_id"

  # this is the full secret wrapped in quotes. don't worry, I've already deleted this from my proxmox cluster by the time you read this post
  pm_api_token_secret = "9ec8e608-d834-4ce5-91d2-15dd59f9a8c1"

  # leave tls_insecure set to true unless you have your proxmox SSL certificate situation fully sorted out (if you do, you will know)
  pm_tls_insecure = true
}

# resource is formatted to be "[type]" "[entity_name]" so in this case
# we are looking to create a proxmox_vm_qemu entity named test_server
resource "proxmox_vm_qemu" "test_server" {
  count = 1 # just want 1 for now, set to 0 and apply to destroy VM
  name = "test-vm-${count.index + 1}" #count.index starts at 0, so + 1 means this VM will be named test-vm-1 in proxmox

  # this now reaches out to the vars file. I could've also used this var above in the pm_api_url setting but wanted to spell it out up there. target_node is different than api_url. target_node is which node hosts the template and thus also which node will host the new VM. it can be different than the host you use to communicate with the API. the variable contains the contents "prox-1u"
  target_node = var.proxmox_host

  # another variable with contents "ubuntu-2004-cloudinit-template"
  clone = var.template_name

  # basic VM settings here. agent refers to guest agent
  agent = 1
  os_type = "cloud-init"
  cores = 2
  sockets = 1
  cpu = "host"
  memory = 2048
  scsihw = "virtio-scsi-pci"
  bootdisk = "scsi0"

  disk {
    slot = 0
    # set disk size here. leave it small for testing because expanding the disk takes time.
    size = "10G"
    type = "scsi"
    storage = "local-zfs"
    iothread = 1
  }
  
  # if you want two NICs, just copy this whole network section and duplicate it
  network {
    model = "virtio"
    bridge = "vmbr0"
  }

  # not sure exactly what this is for. presumably something about MAC addresses and ignore network changes during the life of the VM
  lifecycle {
    ignore_changes = [
      network,
    ]
  }
  
  # the ${count.index + 1} thing appends text to the end of the ip address
  # in this case, since we are only adding a single VM, the IP will
  # be 10.98.1.91 since count.index starts at 0. this is how you can create
  # multiple VMs and have an IP assigned to each (.91, .92, .93, etc.)

  ipconfig0 = "ip=10.98.1.9${count.index + 1}/24,gw=10.98.1.1"
  
  # sshkeys set using variables. the variable contains the text of the key.
  sshkeys = <<EOF
  ${var.ssh_key}
  EOF
}

There is a good amount going on in here. Hopefully the embedded comments explain everything. If not, let me know in the comments or on Reddit (u/Nerdy-Austin).

Now for the vars.tf file. This is a bit easier to understand. Just declare a variable, give it a name, and a default value. That’s all I know at this point and it works.

variable "ssh_key" {
  default = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDcwZAOfqf6E6p8IkrurF2vR3NccPbMlXFPaFe2+Eh/8QnQCJVTL6PKduXjXynuLziC9cubXIDzQA+4OpFYUV2u0fAkXLOXRIwgEmOrnsGAqJTqIsMC3XwGRhR9M84c4XPAX5sYpOsvZX/qwFE95GAdExCUkS3H39rpmSCnZG9AY4nPsVRlIIDP+/6YSy9KWp2YVYe5bDaMKRtwKSq3EOUhl3Mm8Ykzd35Z0Cysgm2hR2poN+EB7GD67fyi+6ohpdJHVhinHi7cQI4DUp+37nVZG4ofYFL9yRdULlHcFa9MocESvFVlVW0FCvwFKXDty6askpg9yf4FnM0OSbhgqXzD austin@EARTH"
}

variable "proxmox_host" {
	default = "prox-1u"
}

variable "template_name" {
	default = "ubuntu-2004-cloudinit-template"
}

5 – Terraform plan (official term for “what will Terraform do next”)

Now with the .tf files completed, we can run the plan (terraform plan). We defined a count=1 resource, so we would expect Terraform to create a single VM. Let’s have Terraform run through the plan and tell us what it intends to do. It tells us a lot.

austin@EARTH:/mnt/c/Users/Austin/terraform-blog$ terraform plan

Terraform used the selected providers to generate the following execution plan. Resource actions
are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # proxmox_vm_qemu.test_server[0] will be created
  + resource "proxmox_vm_qemu" "test_server" {
      + additional_wait           = 15
      + agent                     = 1
      + balloon                   = 0
      + bios                      = "seabios"
      + boot                      = "cdn"
      + bootdisk                  = "scsi0"
      + clone                     = "ubuntu-2004-cloudinit-template"
      + clone_wait                = 15
      + cores                     = 2
      + cpu                       = "host"
      + default_ipv4_address      = (known after apply)
      + define_connection_info    = true
      + force_create              = false
      + full_clone                = true
      + guest_agent_ready_timeout = 600
      + hotplug                   = "network,disk,usb"
      + id                        = (known after apply)
      + ipconfig0                 = "ip=10.98.1.91/24,gw=10.98.1.1"
      + kvm                       = true
      + memory                    = 2048
      + name                      = "test-vm-1"
      + nameserver                = (known after apply)
      + numa                      = false
      + onboot                    = true
      + os_type                   = "cloud-init"
      + preprovision              = true
      + reboot_required           = (known after apply)
      + scsihw                    = "virtio-scsi-pci"
      + searchdomain              = (known after apply)
      + sockets                   = 1
      + ssh_host                  = (known after apply)
      + ssh_port                  = (known after apply)
      + sshkeys                   = <<-EOT
              ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDcwZAOfqf6E6p8IkrurF2vR3NccPbMlXFPaFe2+Eh/8QnQCJVTL6PKduXjXynuLziC9cubXIDzQA+4OpFYUV2u0fAkXLOXRIwgEmOrnsGAqJTqIsMC3XwGRhR9M84c4XPAX5sYpOsvZX/qwFE95GAdExCUkS3H39rpmSCnZG9AY4nPsVRlIIDP+/6YSy9KWp2YVYe5bDaMKRtwKSq3EOUhl3Mm8Ykzd35Z0Cysgm2hR2poN+EB7GD67fyi+6ohpdJHVhinHi7cQI4DUp+37nVZG4ofYFL9yRdULlHcFa9MocESvFVlVW0FCvwFKXDty6askpg9yf4FnM0OSbhgqXzD austin@EARTH
        EOT
      + target_node               = "prox-1u"
      + unused_disk               = (known after apply)
      + vcpus                     = 0
      + vlan                      = -1
      + vmid                      = (known after apply)

      + disk {
          + backup       = 0
          + cache        = "none"
          + file         = (known after apply)
          + format       = (known after apply)
          + iothread     = 1
          + mbps         = 0
          + mbps_rd      = 0
          + mbps_rd_max  = 0
          + mbps_wr      = 0
          + mbps_wr_max  = 0
          + media        = (known after apply)
          + replicate    = 0
          + size         = "10G"
          + slot         = 0
          + ssd          = 0
          + storage      = "local-zfs"
          + storage_type = (known after apply)
          + type         = "scsi"
          + volume       = (known after apply)
        }

      + network {
          + bridge    = "vmbr0"
          + firewall  = false
          + link_down = false
          + macaddr   = (known after apply)
          + model     = "virtio"
          + queues    = (known after apply)
          + rate      = (known after apply)
          + tag       = -1
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.

────────────────────────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take
exactly these actions if you run "terraform apply" now.

You can see the output of the planning phase of Terraform. It is telling us it will create proxmox_vm_qemu.test_server[0] with a list of parameters. You can double-check the IP address here, as well as the rest of the basic settings. At the bottom is the summary – “Plan: 1 to add, 0 to change, 0 to destroy.” Also note that it tells us again what step to run next – “terraform apply”.

6 – Execute the Terraform plan and watch the VMs appear!

With the summary stating what we want, we can now apply the plan (terraform apply). Note that it prompts you to type in ‘yes’ to apply the changes after it determines what the changes are. It typically takes 1m15s +/- 15s for my VMs to get created.

If all goes well, you will be informed that 1 resource was added!

Command and full output:

austin@EARTH:/mnt/c/Users/Austin/terraform-blog$ terraform apply

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # proxmox_vm_qemu.test_server[0] will be created
  + resource "proxmox_vm_qemu" "test_server" {
      + additional_wait           = 15
      + agent                     = 1
      + balloon                   = 0
      + bios                      = "seabios"
      + boot                      = "cdn"
      + bootdisk                  = "scsi0"
      + clone                     = "ubuntu-2004-cloudinit-template"
      + clone_wait                = 15
      + cores                     = 2
      + cpu                       = "host"
      + default_ipv4_address      = (known after apply)
      + define_connection_info    = true
      + force_create              = false
      + full_clone                = true
      + guest_agent_ready_timeout = 600
      + hotplug                   = "network,disk,usb"
      + id                        = (known after apply)
      + ipconfig0                 = "ip=10.98.1.91/24,gw=10.98.1.1"
      + kvm                       = true
      + memory                    = 2048
      + name                      = "test-vm-1"
      + nameserver                = (known after apply)
      + numa                      = false
      + onboot                    = true
      + os_type                   = "cloud-init"
      + preprovision              = true
      + reboot_required           = (known after apply)
      + scsihw                    = "virtio-scsi-pci"
      + searchdomain              = (known after apply)
      + sockets                   = 1
      + ssh_host                  = (known after apply)
      + ssh_port                  = (known after apply)
      + sshkeys                   = <<-EOT
              ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDcwZAOfqf6E6p8IkrurF2vR3NccPbMlXFPaFe2+Eh/8QnQCJVTL6PKduXjXynuLziC9cubXIDzQA+4OpFYUV2u0fAkXLOXRIwgEmOrnsGAqJTqIsMC3XwGRhR9M84c4XPAX5sYpOsvZX/qwFE95GAdExCUkS3H39rpmSCnZG9AY4nPsVRlIIDP+/6YSy9KWp2YVYe5bDaMKRtwKSq3EOUhl3Mm8Ykzd35Z0Cysgm2hR2poN+EB7GD67fyi+6ohpdJHVhinHi7cQI4DUp+37nVZG4ofYFL9yRdULlHcFa9MocESvFVlVW0FCvwFKXDty6askpg9yf4FnM0OSbhgqXzD austin@EARTH
        EOT
      + target_node               = "prox-1u"
      + unused_disk               = (known after apply)
      + vcpus                     = 0
      + vlan                      = -1
      + vmid                      = (known after apply)

      + disk {
          + backup       = 0
          + cache        = "none"
          + file         = (known after apply)
          + format       = (known after apply)
          + iothread     = 1
          + mbps         = 0
          + mbps_rd      = 0
          + mbps_rd_max  = 0
          + mbps_wr      = 0
          + mbps_wr_max  = 0
          + media        = (known after apply)
          + replicate    = 0
          + size         = "10G"
          + slot         = 0
          + ssd          = 0
          + storage      = "local-zfs"
          + storage_type = (known after apply)
          + type         = "scsi"
          + volume       = (known after apply)
        }

      + network {
          + bridge    = "vmbr0"
          + firewall  = false
          + link_down = false
          + macaddr   = (known after apply)
          + model     = "virtio"
          + queues    = (known after apply)
          + rate      = (known after apply)
          + tag       = -1
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

proxmox_vm_qemu.test_server[0]: Creating...
proxmox_vm_qemu.test_server[0]: Still creating... [10s elapsed]
proxmox_vm_qemu.test_server[0]: Still creating... [20s elapsed]
proxmox_vm_qemu.test_server[0]: Still creating... [30s elapsed]
proxmox_vm_qemu.test_server[0]: Still creating... [40s elapsed]
proxmox_vm_qemu.test_server[0]: Still creating... [50s elapsed]
proxmox_vm_qemu.test_server[0]: Still creating... [1m0s elapsed]
proxmox_vm_qemu.test_server[0]: Creation complete after 1m9s [id=prox-1u/qemu/142]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Now go check Proxmox and see if your VM was created:

Successfully added a virtual machine (VM) to Proxmox with Terraform

Success! You should now be able to SSH into the new VM with the key you already provided (note: the username will be ‘ubuntu’, not whatever you had set in your key).

Last – Removing the test VM

I just set the count to 0 for the resource in the main.tf file and apply and the VM is stopped and destroyed.

Conclusion

This felt like a quick-n-dirty tutorial for how to use Terraform to deploy virtual machines in Proxmox but looking back, there is a decent amount of detail. It took me quite a while to work through permission issues, hostnames being invalid (turns out you can’t have underscores (_) in hostnames, duh, that took an hour to find), assigning roles to users vs the associated API keys, etc. but I’m glad I worked through everything and can pass it along. Check back soon for my next post on using Terraform to deploy a full set of Kubernetes machines to a Proxmox cluster (and thrilling sequel to that post, Using Ansible to bootstrap a Kubernetes Cluster)!

References

Categories
homelab Kubernetes Linux proxmox

How to create a Proxmox Ubuntu cloud-init image

Background for why I wanted to make a Proxmox Ubuntu cloud-init image

I have recently ventured down the path of attempting to learn CI/CD concepts. I have tried docker multiple times and haven’t really enjoyed the nuances any of the times. To me, LXC/LXD containers are far easier to understand than Docker when coming from a ‘one VM per service’ background. LXC/LXD containers can be assigned IP addresses (or get them from DHCP) and otherwise behave basically exactly like a VM from a networking perspective. Docker’s networking model is quite a bit more nuanced. Lots of people say it’s easier, but having everything run on ‘localhost:[high number port]’ doesn’t work well when you’ve got lots of services, unless you do some reverse proxying, like with Traefik or similar. Which is another configuration step.

It is so much easier to just have a LXC get an IP via DHCP and then it’s accessible from hostname right off the bat (I use pfSense for DHCP/DNS – all DHCP leases are entered right into DNS). Regardless, I know Kubernetes is the new hotness so I figured I need to learn it. Every tutorial says you need a master and at least two worker nodes. No sense making three separate virtual machines – let’s use the magic of virtualization and clone some images! I plan on using Terraform to deploy the virtual machines for my Kubernetes cluster (as in, I’ve already used this Proxmox Ubuntu cloud-init image to make my own Kubernetes nodes but haven’t documented it yet).

Overview

The quick summary for this tutorial is:

  1. Download a base Ubuntu cloud image
  2. Install some packages into the image
  3. Create a Proxmox VM using the image
  4. Convert it to a template
  5. Clone the template into a full VM and set some parameters
  6. Automate it so it runs on a regular basis (extra credit)?
  7. ???
  8. Profit!

Youtube Video Link

If you prefer video versions to follow along, please head on over to https://youtu.be/1sPG3mFVafE for a live action video of me creating the Proxmox Ubuntu cloud-init image and why we’re running each command.

#1 – Downloading the base Ubuntu image

Luckily, Ubuntu (my preferred distro, guessing others do the same) provides base images that are updated on a regular basis – https://cloud-images.ubuntu.com/. We are interested in the “current” release of Ubuntu 20.04 Focal, which is the current Long Term Support version. Further, since Proxmox uses KVM, we will be pulling that image:

wget https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img

#2 – Install packages

It took me quite a while into my Terraform debugging process to determine that qemu-guest-agent wasn’t included in the cloud-init image. Why it isn’t, I have no idea. Luckily there is a very cool tool that I just learned about that enables installing packages directly into a image. The tool is called virt-customize and it comes in the libguestfs-tools package (“libguestfs is a set of tools for accessing and modifying virtual machine (VM) disk images” – https://www.libguestfs.org/).

Install the tools:

sudo apt update -y && sudo apt install libguestfs-tools -y

Then install qemu-guest-agent into the newly downloaded image:

sudo virt-customize -a focal-server-cloudimg-amd64.img --install qemu-guest-agent

At this point you can presumably install whatever else you want into the image but I haven’t tested installing other packages. qemu-guest-agent was what I needed to get the VM recognized by Terraform and accessible.

Update 2021-12-30 – it is possible to inject the SSH keys into the cloud image itself before turning it into a template and VM. You need to create a user first and the necessary folders:

# not quite working yet. skip this and continue
#sudo virt-customize -a focal-server-cloudimg-amd64.img --run-command 'useradd austin'
#sudo virt-customize -a focal-server-cloudimg-amd64.img --run-command 'mkdir -p /home/austin/.ssh'
#sudo virt-customize -a focal-server-cloudimg-amd64.img --ssh-inject austin:file:/home/austin/.ssh/id_rsa.pub
#sudo virt-customize -a focal-server-cloudimg-amd64.img --run-command 'chown -R austin:austin /home/austin'

#3 – Create a Proxmox virtual machine using the newly modified image

The commands here should be relatively self explanatory but in general we are creating a VM (VMID=9000, basically every other resource I saw used this ID so we will too) with basic resources (2 cores, 2048MB), assigning networking to a virtio adapter on vmbr0, importing the image to storage (your storage here will be different if you’re not using ZFS, probably either ‘local’ or ‘local-lvm’), setting disk 0 to use the image, setting boot drive to disk, setting the cloud init stuff to ide2 (which is apparently appears as a CD-ROM to the VM, at least upon inital boot), and adding a virtual serial port. I had only used qm to force stop VMs before this but it’s pretty useful.

sudo qm create 9000 --name "ubuntu-2004-cloudinit-template" --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0
sudo qm importdisk 9000 focal-server-cloudimg-amd64.img local-zfs
sudo qm set 9000 --scsihw virtio-scsi-pci --scsi0 local-zfs:vm-9000-disk-0
sudo qm set 9000 --boot c --bootdisk scsi0
sudo qm set 9000 --ide2 local-zfs:cloudinit
sudo qm set 9000 --serial0 socket --vga serial0
sudo qm set 9000 --agent enabled=1

You can start the VM up at this point if you’d like and make any other changes you want because the next step is converting it to a template. If you do boot it, I will be completely honest I have no idea how to log into it. I actually just googled this because I don’t want to leave you without an answer – looks like you can use the same virt-customize we used before to set a root password according to stackoverflow (https://stackoverflow.com/questions/29137679/login-credentials-of-ubuntu-cloud-server-image). Not going to put that into a command window here because cloud-init is really meant for public/private key authentication (see post here for a quick SSH tutorial).

#4 – Convert VM to a template

Ok if you made any changes, shut down the VM. If you didn’t boot the VM, that’s perfectly fine also. We need to convert it to a template:

sudo qm template 9000

And now we have a functioning template!

screenshot of proxmox ui showing ubuntu 20.04 cloud-init template

#5 – Clone the template into a full VM and set some parameters

From this point you can clone the template as much as you want. But, each time you do so it makes sense to set some parameters, namely the SSH keys present in the VM as well as the IP address for the main interface. You could also add the SSH keys with virt-customize but I like doing it here.

First, clone the VM (here we are cloning the template with ID 9000 to a new VM with ID 999):

sudo qm clone 9000 999 --name test-clone-cloud-init

Next, set the SSH keys and IP address:

sudo qm set 999 --sshkey ~/.ssh/id_rsa.pub
sudo qm set 999 --ipconfig0 ip=10.98.1.96/24,gw=10.98.1.1

It’s now ready to start up!

sudo qm start 999

You should be able to log in without any problems (after trusting the SSH fingerprint). Note that the username is ‘ubuntu’, not whatever the username is for the key you provided.

ssh [email protected]

Once you’re happy with how things worked, you can stop the VM and clean up the resources:

sudo qm stop 999 && sudo qm destroy 999
rm focal-server-cloudimg-amd64.img

#6 – automating the process

I have not done so yet, but if you create VMs on a somewhat regular basis, it wouldn’t be hard to stick all of the above into a simple shell script (update 2022-04-19: simple shell script below) and run it via cron on a weekly basis or whatever frequency you prefer. I can’t tell you how many times I make a new VM from whatever .iso I downloaded and the first task is apt upgrade taking forever to run (‘sudo apt update’ –> “176 packages can be upgraded”). Having a nice template always ready to go would solve that issue and would frankly save me a ton of time.

#6.5 – Shell script to create template

# installing libguestfs-tools only required once, prior to first run
sudo apt update -y
sudo apt install libguestfs-tools -y

# remove existing image in case last execution did not complete successfully
rm focal-server-cloudimg-amd64.img
wget https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img
sudo virt-customize -a focal-server-cloudimg-amd64.img --install qemu-guest-agent
sudo qm create 9000 --name "ubuntu-2004-cloudinit-template" --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0
sudo qm importdisk 9000 focal-server-cloudimg-amd64.img local-zfs
sudo qm set 9000 --scsihw virtio-scsi-pci --scsi0 local-zfs:vm-9000-disk-0
sudo qm set 9000 --boot c --bootdisk scsi0
sudo qm set 9000 --ide2 local-zfs:cloudinit
sudo qm set 9000 --serial0 socket --vga serial0
sudo qm set 9000 --agent enabled=1
sudo qm template 9000
rm focal-server-cloudimg-amd64.img
echo "next up, clone VM, then expand the disk"
echo "you also still need to copy ssh keys to the newly cloned VM"

#7-8 – Using this template with Terraform to automate VM creation

Next post – How to deploy VMs in Proxmox with Terraform

References

https://matthewkalnins.com/posts/home-lab-setup-part-1-proxmox-cloud-init/
https://registry.terraform.io/modules/sdhibit/cloud-init-vm/proxmox/latest/examples/ubuntu_single_vm

My original notes

https://matthewkalnins.com/posts/home-lab-setup-part-1-proxmox-cloud-init/
https://registry.terraform.io/modules/sdhibit/cloud-init-vm/proxmox/latest/examples/ubuntu_single_vm

# create cloud image VM
wget https://cloud-images.ubuntu.com/focal/20210824/focal-server-cloudimg-amd64.img
sudo qm create 9000 --name "ubuntu-2004-cloudinit-template" --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0

# to install qemu-guest-agent or whatever into the guest image
#sudo apt-get install libguestfs-tools
#virt-customize -a focal-server-cloudimg-amd64.img --install qemu-guest-agent
sudo qm importdisk 9000 focal-server-cloudimg-amd64.img local-zfs
sudo qm set 9000 --scsihw virtio-scsi-pci --scsi0 local-zfs:vm-9000-disk-0
sudo qm set 9000 --boot c --bootdisk scsi0
sudo qm set 9000 --ide2 local-zfs:cloudinit
sudo qm set 9000 --serial0 socket --vga serial0
sudo qm template 9000

# clone cloud image to new VM
sudo qm clone 9000 999 --name test-clone-cloud-init
sudo qm set 999 --sshkey ~/.ssh/id_rsa.pub
sudo qm set 999 --ipconfig0 ip=10.98.1.96/24,gw=10.98.1.1
sudo qm start 999
  
# remove known host because SSH key changed
ssh-keygen -f "/home/austin/.ssh/known_hosts" -R "10.98.1.96"

# ssh in
ssh -i ~/.ssh/id_rsa [email protected]

# stop and destroy VM
sudo qm stop 999 && sudo qm destroy 999
Categories
Linux Raspberry Pi

Microsecond accurate NTP with a Raspberry Pi and PPS GPS

Introduction

Lots of acronyms in that title. If I expand them out, it says – “microsecond accurate network time protocol with a Raspberry Pi and pulse per second global positioning system [receiver]”. What it means is you can get super accurate timekeeping (1 microsecond = 0.000001 seconds) with a Raspberry Pi and a GPS receiver (the GT-U7 is less than $12) that spits out pulses every second. By following this guide, you will your very own Stratum 1 NTP server at home!

Why would you need time this accurate at home?

You don’t. There aren’t many applications for this level of timekeeping in general, and even fewer at home. But this blog is called Austin’s Nerdy Things so here we are. Using standard, default internet NTP these days will get your computers to within a 10-20 milliseconds of actual time (1 millisecond = 0.001 seconds). By default, Windows computers get time from time.windows.com. MacOS computers get time from time.apple.com. Linux devices get time from [entity].pool.ntp.org, like debian.pool.ntp.org. PPS gets you to the next SI prefix in terms of accuracy (milli -> micro), which means 1000x more accurate timekeeping.

YouTube video link

If you prefer a video version – https://www.youtube.com/watch?v=YfgX7qPeiqQ

Materials needed

Steps

0 – Update your Pi and install packages

This NTP guide assumes you have a Raspberry Pi ready to go. If you don’t, I have a 16 minute video on YouTube that goes through flashing the SD card and initial setup – https://youtu.be/u5dHvEYwr9M.

You should update your Pi to latest before basically any project. We will install some other packages as well. pps-tools help us check that the Pi is receiving PPS signals from the GPS module. We also need GPSd for the GPS decoding of both time and position. I use chrony instead of NTPd because it seems to sync faster than NTPd in most instances and also handles PPS without compiling from source (the default Raspbian NTP doesn’t do PPS) Installing chrony will remove ntpd.

sudo apt update
sudo apt upgrade
sudo rpi-update
sudo apt install pps-tools gpsd gpsd-clients python-gps chrony

1 – add GPIO and module info where needed

In /boot/config.txt, add ‘dtoverlay=pps-gpio,gpiopin=18’ to a new line. This is necessary for PPS. If you want to get the NMEA data from the serial line, you must also enable UART and set the initial baud rate.

sudo bash -c "echo '# the next 3 lines are for GPS PPS signals' >> /boot/config.txt"
sudo bash -c "echo 'dtoverlay=pps-gpio,gpiopin=18' >> /boot/config.txt"
sudo bash -c "echo 'enable_uart=1' >> /boot/config.txt"
sudo bash -c "echo 'init_uart_baud=9600' >> /boot/config.txt"

In /etc/modules, add ‘pps-gpio’ to a new line.

sudo bash -c "echo 'pps-gpio' >> /etc/modules"

Reboot

sudo reboot

2 – wire up the GPS module to the Pi

I used the Adafruit Ultimate GPS breakout. It has 9 pins but we are only interested in 5. There is also the Adafruit GPS hat which fits right on the Pi but that seems expensive for what it does (but it is significantly neater in terms of wiring).

Pin connections:

  1. GPS PPS to RPi pin 12 (GPIO 18)
  2. GPS VIN to RPi pin 2 or 4
  3. GPS GND to RPi pin 6
  4. GPS RX to RPi pin 8
  5. GPS TX to RPi pin 10
  6. see 2nd picture for a visual
Adafruit Ultimate GPS Breakout V3
We use 5 wires total. GPS PPS to pin 12 (GPIO 18), GPS VIN to pin 2 or 4, GPS GND to pin 6, GPS RX to pin 8, GPS TX to pin 10.
GPS with wires attached to the board (but not to the Pi) and the antenna. The antenna has a SMA connector, and there is another adapter that is SMA to u.fl to plug into the GPS board.

Now place your GPS antenna (if you have one) in a spot with a good view of the sky. If you don’t have an antenna, you might have to get a bit creative with how you locate your Raspberry Pi with attached GPS.

I honestly have my antenna in the basement (with only the kitchen and attic above) and I generally have 8-10 satellites locked all the time (11 as of writing). Guess that means the antenna works better than expected! Either way, better exposure to the sky will in theory work better. Pic:

My super awesome placement of the GPS antenna on top of the wood “cage” (?) I built to hold my 3d printer on top of my server rack. I guess this is a testimony for how well the GPS antenna works? It has 11 satellites locked on as of writing, with a HDOP of 0.88. The components in this rack (Brocade ICX6450-48P, 1U white box with Xeon 2678v3/96GB memory/2x480GB enterprise SSDs/4x4TB HDDs, Dell R710 with 4x4TB and other stuff) will be detailed in an upcoming post.

2.5 – free up the UART serial port for the GPS device

Run raspi-config -> 3 – Interface options -> I6 – Serial Port -> Would you like a login shell to be available over serial -> No. -> Would you like the serial port hardware to be enabled -> Yes.

Finish. Yes to reboot.

3 – check that PPS is working

First, check that the PPS module is loaded:

lsmod | grep pps

The output should be like:

pi@raspberrypi:~ $ lsmod | grep pps
pps_gpio               16384  0
pps_core               16384  1 pps_gpio

Second, check for the PPS pulses:

sudo ppstest /dev/pps0

The output should spit out a new line every second that looks something like this (your output will be a bit farther from x.000000 since it isn’t yet using the GPS PPS):

pi@pi-ntp:~ $ sudo ppstest /dev/pps0
trying PPS source "/dev/pps0"
found PPS source "/dev/pps0"
ok, found 1 source(s), now start fetching data...
source 0 - assert 1618799061.999999504, sequence: 882184 - clear  0.000000000, sequence: 0
source 0 - assert 1618799062.999999305, sequence: 882185 - clear  0.000000000, sequence: 0
source 0 - assert 1618799063.999997231, sequence: 882186 - clear  0.000000000, sequence: 0
source 0 - assert 1618799064.999996827, sequence: 882187 - clear  0.000000000, sequence: 0
source 0 - assert 1618799065.999995852, sequence: 882188 - clear  0.000000000, sequence: 0
^C

4 – change GPSd to start immediately upon boot

Edit /etc/default/gpsd and change GPSD_OPTIONS=”” to GPSD_OPTIONS=”-n” and change DEVICES=”” to DEVICES=”/dev/ttyS0 /dev/pps0″, then reboot. My full /etc/default/gpsd is below:

pi@raspberrypi:~ $ sudo cat /etc/default/gpsd
# Default settings for the gpsd init script and the hotplug wrapper.

# Start the gpsd daemon automatically at boot time
START_DAEMON="true"

# Use USB hotplugging to add new USB devices automatically to the daemon
USBAUTO="true"

# Devices gpsd should collect to at boot time.
# They need to be read/writeable, either by user gpsd or the group dialout.
DEVICES="/dev/ttyS0 /dev/pps0"

# Other options you want to pass to gpsd
GPSD_OPTIONS="-n"
sudo reboot

4.5 – check GPS for good measure

To ensure your GPS has a valid position, you can run gpsmon or cgps to check satellites and such. This check also ensures GPSd is functioning as expected. If your GPS doesn’t have a position solution, you won’t get a good time signal. If GPSd isn’t working, you won’t get any updates on the screen. The top portion will show the analyzed GPS data and the bottom portion will scroll by with the raw GPS sentences from the GPS module.

gpsmon showing 10 satellites used for the position with HDOP of 0.88. This indicates a good position solution which means the time signals should be good as well. The PPS of 0.000000684 indicates the Raspberry Pi is only 684 nanoseconds off of GPS satellite time.

5 – edit config files

For chrony, add these two lines to the /etc/chrony/chrony.conf file somewhere near the rest of the server lines:

refclock SHM 0 refid NMEA offset 0.200
refclock PPS /dev/pps0 refid PPS lock NMEA

My entire /etc/chrony/chrony.conf file looks like this:

###### below this line are custom config changes #######
server 10.98.1.1 iburst minpoll 3 maxpoll 5
server time-a-b.nist.gov iburst
server time-d-b.nist.gov
server utcnist.colorado.edu
server time.windows.com
server time.apple.com

# delay determined experimentally by setting noselect then monitoring for a few hours
# 0.325 means the NMEA time sentence arrives 325 milliseconds after the PPS pulse
# the delay adjusts it forward
refclock SHM 0 delay 0.325 refid NMEA
refclock PPS /dev/pps0 refid PPS

allow 10.98.1.0/24 # my home network
###### above this line are custom config changes #######

###### below this line is standard chrony stuff #######
keyfile /etc/chrony/chrony.keys
driftfile /var/lib/chrony/chrony.drift
#log tracking measurements statistics
logdir /var/log/chrony
maxupdateskew 100.0
hwclockfile /etc/adjtime
rtcsync
makestep 1 3

Restart chrony, wait a few minutes, and verify.

sudo systemctl restart chrony

5 – verify the NTP server is using the GPS PPS

Right after a chrony restart, the sources will look like this (shown by running ‘chronyc sources’)

pi@pi-ntp:~ $ chronyc sources
210 Number of sources = 9
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#? NMEA                          0   4     0     -     +0ns[   +0ns] +/-    0ns
#? PPS                           0   4     0     -     +0ns[   +0ns] +/-    0ns
^? pfsense.home.fluffnet.net     0   3     3     -     +0ns[   +0ns] +/-    0ns
^? time-a-b.nist.gov             1   6     3     1  -2615us[-2615us] +/- 8218us
^? time-d-b.nist.gov             1   6     1     3  -2495us[-2495us] +/- 7943us
^? india.colorado.edu            0   6     0     -     +0ns[   +0ns] +/-    0ns
^? 13.86.101.172                 3   6     1     4  -4866us[-4866us] +/-   43ms
^? usdal4-ntp-002.aaplimg.c>     1   6     1     4  -2143us[-2143us] +/-   13ms
^? time.cloudflare.com           3   6     1     3  -3747us[-3747us] +/- 9088us

The # means locally connected source of time. The question marks mean it is still determine the status of each source.

After a couple minutes, you can check again:

pi@pi-ntp:~ $ chronyc -n sources
210 Number of sources = 9
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#x NMEA                          0   4   377    23    -37ms[  -37ms] +/- 1638us
#* PPS                           0   4   377    25   -175ns[ -289ns] +/-  126ns
^? 10.98.1.1                     0   5   377     -     +0ns[   +0ns] +/-    0ns
^- 132.163.96.1                  1   6   177    22  -3046us[-3046us] +/- 8233us
^- 2610:20:6f96:96::4            1   6    17    28  -2524us[-2524us] +/- 7677us
^? 128.138.140.44                1   6     3    30  -3107us[-3107us] +/- 8460us
^- 13.86.101.172                 3   6    17    28  -8233us[-8233us] +/-   47ms
^- 17.253.2.253                  1   6    17    29  -3048us[-3048us] +/-   14ms
^- 2606:4700:f1::123             3   6    17    29  -3325us[-3325us] +/- 8488us

For the S column, * means the source that is active. + means it is considered a good source and would be used if the current one is determined to be bad or is unavailable. The x shown for the NMEA source means it is a “false ticker”, which means it isn’t being used. In our case that is fine because the PPS source is active and valid. Anything else generally means it won’t be used.

In this case, chrony is using the PPS signal. The value inside the brackets is how far off chrony is from the source. It is showing that we are 289 nanoseconds off of GPS PPS time. This is very, very close to atomic clock level accuracy. The last column (after the +/-) includes latency to the NTP source as well as how far out of sync chrony thinks it is (for example, the 17.253.2.253 server is 12.5 milliseconds away one-way via ping):

pi@pi-ntp:~ $ ping -c 5 17.253.2.253
PING 17.253.2.253 (17.253.2.253) 56(84) bytes of data.
64 bytes from 17.253.2.253: icmp_seq=1 ttl=54 time=25.2 ms
64 bytes from 17.253.2.253: icmp_seq=2 ttl=54 time=27.7 ms
64 bytes from 17.253.2.253: icmp_seq=3 ttl=54 time=23.8 ms
64 bytes from 17.253.2.253: icmp_seq=4 ttl=54 time=24.4 ms
64 bytes from 17.253.2.253: icmp_seq=5 ttl=54 time=23.4 ms

--- 17.253.2.253 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4007ms
rtt min/avg/max/mdev = 23.403/24.954/27.780/1.547 ms

For a full list of information about how to interpret these results, check here – https://docs.fedoraproject.org/en-US/Fedora/18/html/System_Administrators_Guide/sect-Checking_if_chrony_is_synchronized.html

6 – results

A day after the bulk of writing this post, I turned on logging via the .conf file and restarted chrony. It settled to microsecond level accuracy in 57 seconds. For the offset column, the scientific notation e-7 means the results are in nanoseconds. This means that for the 19:54:35 line, the clock is -450.9 nanoseconds off of the PPS signal. There is that e-10 line in there which says that it is 831 picoseconds off PPS (I had to look up SI prefixes to make sure pico came after nano! also I doubt the Pi can actually keep track of time that closely.). After the initial sync, there is only 1 line in the below log that is actually at the microsecond level (the others are all better than microsecond) – the 20:00:59 entry, which shows the clock is -1.183 microseconds off.

Things that affect timekeeping on the Pi

Thought I’d toss in this section for completeness (i.e. thanks for all the good info Austin but how can I make this even better?). There are a few things that affect how well the time is kept on the Pi:

  • Ambient temperature around the Pi – if you plot the freq PPM against ambient temperature, there will be a clear trend. The more stable the ambient temp, the less variation in timekeeping.
  • Load on the Pi – similar to above. Highly variable loads will make the processor work harder and easier. Harder working processor means more heat. More heat means more variability. These crystals are physical devices after all.
  • GPS reception – they actually making timing GPS chips that prefer satellites directly overhead. They have better ability to filter out multipathing and such. In general, the better the GPS reception, the better the PPS signal.

Conclusion

After running through the steps in this guide, you should now have a functional Stratum 1 NTP server running on your Raspberry Pi with microsecond level accuracy provided by a PPS GPS. This system can obtain time in the absence of any external sources other than the GPS (for example, internet time servers), and then sync up with the extremely precise GPS PPS signal. Our NTP GPS PPS clock is now within a few microseconds of actual GPS time.

Update 2024-01-19: RIP Dave Mills, inventor/creator of NTP – https://arstechnica.com/gadgets/2024/01/inventor-of-ntp-protocol-that-keeps-time-on-billions-of-devices-dies-at-age-85/

References

I read a ton on https://www.satsignal.eu/ntp/Raspberry-Pi-NTP.html and other pages on that domain over the years which has been extremely helpful in getting GPS PPS NTP going for my setup. There is a lot of background info for how/why this stuff works and many useful graphics. Another source is https://wellnowweknow.com/index.php/2019/12/27/how-to-ntp-a-raspberry-pi-4-via-gps-and-pps/ for a short and sweet (maybe too short) set of commands.

Categories
homelab Linux proxmox

Proxmox Cluster manual update

Recently ran into an issue where I added a node to my Proxmox cluster while a node was disconnected & off. That node (prox) caused the others to become unresponsive for a number of Proxmox things (it was missing from the cluster) upon boot.

Set node to local mode

The solution was to put the node that had been offline (called prox) into “local” mode. Thanks to Nicholas of Technicus / techblog.jeppson.org for the commands to do so:

sudo systemctl stop pve-cluster
sudo /usr/bin/pmxcfs -l

This allows editing of the all-important /etc/pve/corosync.conf file.

Manually update corosync.conf

I basically just had to copy over the config present on the two synchronized nodes to node prox, then reboot. This allowed node prox to join the cluster again and things started working fine.

Problem corosync.conf on node prox:

logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: prox
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.98.1.14
  }
  node {
    name: prox-1u
    nodeid: 2
    quorum_votes: 3
    ring0_addr: 10.98.1.15
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: prox-cluster
  config_version: 4
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  secauth: on
  version: 2
}

Fancy new corosync.conf on nodes prox-1u and prox-m92p:

logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: prox
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.98.1.14
  }
  node {
    name: prox-1u
    nodeid: 2
    quorum_votes: 3
    ring0_addr: 10.98.1.15
  }
  node {
    name: prox-m92p
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.98.1.92
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: prox-cluster
  config_version: 5
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  secauth: on
  version: 2
}

The difference is that third node item as well as incrementing the config_version from 4 to 5. After I made those changes on node prox and rebooted, things worked fine.

 

 

 

Categories
Linux

SSH Key Tutorial

SSH Key Tutorial

This post will be a basic SSH key tutorial. In my Securing this WordPress blog from evil hackers! post, I recommended turning off password based SSH authentication and moving exclusively to SSH key based authentication. This post will go over the steps to set up SSH key authentication. The outline is:

  1. Generate SSH public/private key pair
  2. Transfer the public key to the host in question
    1. manual method (Windows)
    2. automatic method (Linux/Mac)
  3. Verify the SSH key authentication is functional.

At the end of this post is my user creation script which does all of this automatically.

I do go over this in a good amount of detail in my first ever YouTube video which you can view here if you’d prefer a video instead of text.

I will be generating keys on my new laptop and transferring them to the Raspberry Pi I set up in my second YouTube video (Austin’s Nerdy Things Ep 2 – Setting up a Raspberry Pi) – https://youtu.be/u5dHvEYwr9M.

Step 1 – Generate the SSH key pair

The first thing we need to do is generate the SSH key pair. On all major OSes these days, this command is included (Windows finally joined the other “real” OSes in 2018 with full SSH support out of the box). We will be using ED25519, which is advised for basically anything build/updated since 2014.

ssh-keygen -t ed25519

This will start the generation process. I hit enter when it prompts for the location (it defaults to C:/Users/<user>/.ssh/id_<type> which is fine). I also hit enter again at the passphrase prompt twice because I don’t want to use a passphrase. Using a passphrase increase security but it can’t be used for any automated processes because it’ll prompt every time for the passphrase. This is the full output from the process:

C:\Users\Austin>ssh-keygen -t ed25519
Generating public/private ed25519 key pair.
Enter file in which to save the key (C:\Users\Austin/.ssh/id_ed25519): [press enter here]
Enter passphrase (empty for no passphrase): [press enter here]
Enter same passphrase again: [press enter here]
Your identification has been saved in C:\Users\Austin/.ssh/id_ed25519.
Your public key has been saved in C:\Users\Austin/.ssh/id_ed25519.pub.
The key fingerprint is:
SHA256:t0FIIk<snip.......................snip>6Rx4 austin@DESKTOP-3IPSAN3
The key's randomart image is:
+--[ED25519 256]--+
|o++++++ .        |
|*  + * o .       |
|o.o o   . .      |
| . .     .       |
|..... o S o      |
| . + .  .o.o..   |
|  = . o.+.+=o..  |
|   +.o.  +a..o   |
|  ..ooo o+...    |
+----[SHA256]-----+

Step 2 – Transfer the SSH public key

For Windows, they included all the useful SSH utilities except one: ssh-copy-id. This is the easy way to transfer SSH keys. It is possible to transfer the key without using this utility. For this SSH key tutorial, I will show both.

Step 2a – Transfer SSH key (Windows)

First we need to get the contents of the SSH public key. This will be done with the little-known type command of Windows. In Step 1, the file was placed at C:\Users\Austin\.ssh\id_ed25519 so that’s where we will read:

C:\Users\Austin>type C:\Users\Austin\.ssh\id_ed25519.pub
ssh-ed25519 AAAAC3NzaC1lZD2PT/VIK3KyYRliA0jgqmn7yzkVjuDiFK67Cio austin@DESKTOP-3IPSAN3

The contents of the file are the single line of the SSH public key (starting with the type of key “ssh-ed25519” and then continuing with the key contents “AAAAC3” and ending with the username and host the key was generated on “austin@DESKTOP-3IPSAN3”). This is the data we need to transfer to the new host.

I will first log onto the Raspberry Pi with the username/password combo from the default installation:

ssh pi@raspberrypi

I am now logged into the Pi:

Raspberry Pi Log In Prompt
Logged into the Raspberry Pi

Now on the target machine we need to create the user account and set up some initial things. We use sudo on all of these since root level permissions are required.

# create user austin with home directory (-m means create the home directory)
sudo useradd -m austin

# set the password for user austin
sudo passwd austin

# add user austin to sudo group
sudo usermod -aG sudo austin

# make the .ssh directory for user austin
sudo mkdir /home/austin/.ssh

sudo useradd
User created, password set, sudo added, and directory created

With those four commands run, the user is created, has a password, can run sudo, and has an empty .ssh directory ready for the keys.

Now we need to create the authorized_keys file:

sudo nano /home/austin/.ssh/authorized_keys

In this file, paste (by right-clicking) the single line from the id_ed25519 file:

nano authorized_keys
line pasted and ready to save

To save the file, press CTRL+O and a prompt will appear. Press enter again to save. Now to exit, press CTRL+X. There should be no “are you sure” prompt because the contents are saved.

Lastly, we need to change the owner of the .ssh directory and all of it’s contents to the new user (sudo means they were created as root):

sudo chown -R austin:austin /home/austin/.ssh

Now we can test. From the machine that generated the SSH key, SSH to the target with the new user specified:

ssh austin@raspberrypi

You should get in without having to type a password!

logged in with SSH key
Successfully logged in without typing a password (the SSH key worked)

Step 2b – Transfer SSH key (Linux)

This is much easier. The ssh-copy-id utility does what we just did in a single command.

# make sure you've already generated keys on your Linux/Mac machine
ssh-copy-id austin@raspberrypi

linux ssh-copy-id command
ssh-copy-id copied the keys and did everything for us in a single command

Microsoft – will you please add ssh-copy-id to default Windows installations?

Step 3 – Verify

We did this at the end of each of the other methods.

User creation script

This is the script I use on any new Linux virtual machine or container I spin up. It updates packages, installs a few things, sets the timezone, does all the user creation stuff and sets the SSH key. Lastly, it changes the shell to bash. Feel free to modify this for your own use. It is possible to copy the data from /etc/shadow so you don’t have to type in the password. I haven’t got that far yet.

apt update
apt upgrade -y 
apt install -y net-tools sudo htop sysstat git curl wget zsh
timedatectl set-timezone America/Denver
useradd -m austin
passwd austin
adduser austin sudo
usermod -aG sudo austin
usermod -aG adm austin
groups austin
mkdir /home/austin/.ssh
touch /home/austin/.ssh/authorized_keys
echo 'ssh-rsa AAAAB  <snip>  bhgqXzD austin@EARTH' >> /home/austin/.ssh/authorized_keys
chown -R austin:austin /home/austin/.ssh
usermod -s /bin/bash austin