Proxmox Cluster manual update

I recently ran into an issue after adding a node to my Proxmox cluster while an existing node was disconnected and powered off. When that node (prox) booted again, it was missing from the cluster, and the other nodes became unresponsive for a number of Proxmox operations.

Set node to local mode

The solution was to put the node that had been offline (called prox) into “local” mode. Thanks to Nicholas of Technicus / techblog.jeppson.org for the commands to do so:

sudo systemctl stop pve-cluster
sudo /usr/bin/pmxcfs -l

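With pve-cluster stopped and pmxcfs running in local mode, /etc/pve becomes writable on this node without cluster quorum, which allows editing of the all-important /etc/pve/corosync.conf file.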

Manually update corosync.conf

I basically just had to copy the config from the two synchronized nodes over to node prox, then reboot. After that, node prox joined the cluster again and everything started working fine.
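
The copy step could be sketched as a small shell helper. This is only a sketch under assumptions: the function name stage_corosync_conf, the default backup path /root/corosync.conf.bak, and the backup itself are my additions and were not part of the original fix. All paths are arguments, so the helper can be tried on scratch files before touching /etc/pve.

```shell
# Hypothetical helper (not a Proxmox tool): back up the current config,
# then copy a reference config over it.
#   $1 = reference config taken from a healthy node
#   $2 = config to overwrite on the stale node
#   $3 = backup location (assumed default: /root/corosync.conf.bak)
stage_corosync_conf() {
  src="$1"
  dest="$2"
  backup="${3:-/root/corosync.conf.bak}"

  # Refuse to continue if the reference file is missing or empty
  if [ ! -s "$src" ]; then
    echo "reference config missing or empty: $src" >&2
    return 1
  fi

  # Keep a copy of the stale config, if one exists
  if [ -f "$dest" ]; then
    cp "$dest" "$backup" || return 1
  fi

  # Overwrite the stale config with the reference config
  cp "$src" "$dest"
}

# Possible usage on node prox while pmxcfs runs in local mode. The root login
# and the /tmp path are assumptions; 10.98.1.15 is prox-1u in the configs below.
#   scp root@10.98.1.15:/etc/pve/corosync.conf /tmp/corosync.conf.new
#   stage_corosync_conf /tmp/corosync.conf.new /etc/pve/corosync.conf
```

Writing the backup outside /etc/pve keeps it out of the cluster filesystem, so it stays available even if the node has trouble rejoining.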

Problem corosync.conf on node prox:

logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: prox
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.98.1.14
  }
  node {
    name: prox-1u
    nodeid: 2
    quorum_votes: 3
    ring0_addr: 10.98.1.15
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: prox-cluster
  config_version: 4
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  secauth: on
  version: 2
}

Fancy new corosync.conf on nodes prox-1u and prox-m92p:

logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: prox
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.98.1.14
  }
  node {
    name: prox-1u
    nodeid: 2
    quorum_votes: 3
    ring0_addr: 10.98.1.15
  }
  node {
    name: prox-m92p
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.98.1.92
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: prox-cluster
  config_version: 5
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  secauth: on
  version: 2
}

The differences are the third node entry and the config_version, which was incremented from 4 to 5. After I made those changes on node prox and rebooted, things worked fine.
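
To spot this kind of drift quickly, the two files can be boiled down to the fields that matter here. The helper below is a rough sketch, not a Proxmox or corosync tool: the name corosync_summary is made up, and it assumes that `name:` only appears inside node blocks, as it does in the configs above.

```shell
# Hypothetical helper: print the node names and config_version found in a
# corosync.conf, one per line, in the order they appear in the file.
#   $1 = path to a corosync.conf
corosync_summary() {
  awk '
    $1 == "name:"           { print "node=" $2 }            # one line per node entry
    $1 == "config_version:" { print "config_version=" $2 }  # version from the totem block
  ' "$1"
}

# Possible usage (the /tmp path is an assumption): run it on the local config
# and on a copy fetched from a healthy node, then compare the two outputs.
#   corosync_summary /etc/pve/corosync.conf
#   corosync_summary /tmp/corosync.conf.new
```

Run against the two configs shown above, the stale file would list only prox and prox-1u with config_version=4, while the current one adds prox-m92p and reports config_version=5.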