Sometimes I change my Proxmox cluster configuration while not all nodes are healthy (usually because some are powered off). This isn’t a great habit to get into, and it sometimes leaves me troubleshooting afterwards.
Putting a quick post up so I can easily reference how to resolve corosync issues.
    # stop corosync and pmxcfs on all nodes
    $ systemctl stop corosync pve-cluster

    # start pmxcfs in local mode on all nodes
    $ pmxcfs -l

    # put correct corosync config into local pmxcfs and corosync config dir
    # (make sure to bump the 'config_version' inside the config file)
    $ cp correct_corosync.conf /etc/pve/corosync.conf
    $ cp correct_corosync.conf /etc/corosync/corosync.conf

    # kill local pmxcfs
    $ killall pmxcfs

    # start corosync and pmxcfs again
    $ systemctl start pve-cluster corosync

    # check status
    $ journalctl --since '-5min' -u pve-cluster -u corosync
    $ pvecm status
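The step that is easiest to forget is bumping `config_version`. Below is a minimal sketch of a helper that does it. The function name `bump_config_version` is my own invention, not something Proxmox ships, and it assumes the file contains a single `config_version: N` line with a plain integer. Treat it as a starting point and diff the file afterwards.

```shell
# Hypothetical helper (not part of Proxmox): increment the config_version
# value in a corosync.conf-style file and print the new version.
# Assumes exactly one "config_version: <integer>" line.
bump_config_version() {
    local file="$1" current next tmp

    # Read the current value; leading whitespace is ignored by awk.
    current=$(awk '$1 == "config_version:" { print $2; exit }' "$file")

    # Refuse to continue if the value is missing or not a plain integer.
    case "$current" in
        ''|*[!0-9]*)
            echo "no numeric config_version found in $file" >&2
            return 1
            ;;
    esac

    next=$((current + 1))

    # Rewrite via a temp file, then copy the contents back so the original
    # file keeps its ownership and permissions.
    tmp=$(mktemp) || return 1
    sed -E "s/^([[:space:]]*config_version:[[:space:]]*)[0-9]+/\1${next}/" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    rm -f "$tmp"

    echo "$next"
}

# Example usage (run this before the two cp commands above):
#   bump_config_version correct_corosync.conf
```

Running it on `correct_corosync.conf` before copying means both `/etc/pve/corosync.conf` and `/etc/corosync/corosync.conf` end up with the same, higher version number.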
Some of the errors I ran into, included here to help search engines find this post:
    ipcc_send_rec failed: Connection refused
    ipcc_send_rec failed: Connection refused
    ipcc_send_rec failed: Connection refused
    Unable to load access control list: Connection refused