SSH - passwordless lockout

Passwordless SSH can lock you out

November 8, 2024

If you follow standard security practices, you would not allow root logins at all, let alone over SSH (as with a standard Debian install). But this would render your PVE unable to function properly, so all you can do is fix your /etc/ssh/sshd_config 1 with the option:

PermitRootLogin prohibit-password

That way, you only allow connections with valid keys (not passwords). Prior to this, you would have copied over your public keys with ssh-copy-id 2 or otherwise added them to /root/.ssh/authorized_keys.

But this has a huge caveat on any standard PVE install. When you examine the file, it is actually a symbolic link:

/root/.ssh/authorized_keys -> /etc/pve/priv/authorized_keys

This is because other nodes’ keys are already stored there to allow for cross-connecting, and the location is shared. This has several issues, the most important of which is that the actual file lies in /etc/pve, which is a virtual filesystem 3 mounted only when all goes well during boot-up.
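To see this for yourself, you can inspect the key file on a node. The following is a minimal sketch, not part of any PVE tooling; the helper name check_keys_file is made up for illustration. It reports whether the file is a symlink and whether its target currently exists:

```shell
#!/bin/sh
# Hypothetical helper: classify a key file as a symlink (with or without
# an existing target), a regular file, or absent.
check_keys_file() {
    f="$1"
    if [ -L "$f" ]; then
        target=$(readlink "$f")
        # -e follows the symlink, so it fails when the target is missing
        if [ -e "$f" ]; then
            echo "symlink to $target (target present)"
        else
            echo "symlink to $target (target missing)"
        fi
    elif [ -e "$f" ]; then
        echo "regular file"
    else
        echo "absent"
    fi
}

# Defaults to root's key file; pass another path as the first argument.
check_keys_file "${1:-/root/.ssh/authorized_keys}"
```

On a node where /etc/pve failed to mount, the symlink would be expected to show up with its target missing.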

What could go wrong

If your /etc/pve does not get mounted during bootup, your node will appear offline and will not be accessible over SSH, let alone GUI.

Warning

If you access it via another node’s GUI, you will get a confusing Permission denied (publickey,password) in the “Shell”.

You are essentially locked out, even though the system has otherwise booted up, apart from the PVE services. You cannot troubleshoot over SSH; you would need to resort to OOB management or physical access.

This is because, with /etc/pve not mounted, the file /etc/pve/priv/authorized_keys does not exist, so there is nothing to verify your key against during your SSH connection.
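If you do get to a console, a quick way to confirm this state is to check whether /etc/pve is a mount point at all. A minimal sketch, assuming mountpoint(1) from util-linux is available (it is on a standard Debian system); the function name is_mounted is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print whether a directory is currently a mount point.
is_mounted() {
    if mountpoint -q "$1"; then
        echo "mounted"
    else
        echo "not mounted"
    fi
}

# Defaults to the PVE virtual filesystem; pass another path to check it instead.
is_mounted "${1:-/etc/pve}"
```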

Caution

If you also allow root to authenticate by password, you will be locked out of the GUI only. SSH will still not work with your key, obviously, but it will fall back to a password prompt.

How to avoid this

You need to use your own authorized_keys file, different from the default one that has been hijacked by PVE. The proper way to do this is to define its location in the config:

cat > /etc/ssh/sshd_config.d/LocalAuthorizedKeys.conf <<< "AuthorizedKeysFile .ssh/local_authorized_keys"

If you now copy your own keys to the /root/.ssh/local_authorized_keys file (on every node), you are immune to this design flaw.
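Copying the keys can be sketched as below. This is only an illustration under stated assumptions: the function name add_local_key and the public key path in the usage comment are hypothetical, and the permissions follow the usual OpenSSH expectations (directory 700, key file 600).

```shell
#!/bin/sh
# Hypothetical helper: append a public key to local_authorized_keys in the
# given .ssh directory, skipping it if the exact line is already present.
add_local_key() {
    ssh_dir="$1"       # e.g. /root/.ssh
    pubkey_file="$2"   # a file holding one public key line
    keys="$ssh_dir/local_authorized_keys"

    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    touch "$keys" && chmod 600 "$keys"

    key=$(cat "$pubkey_file")
    # -x: match the whole line, -F: treat the key as a fixed string
    grep -qxF -- "$key" "$keys" || printf '%s\n' "$key" >> "$keys"
}

# Usage on each node, as root (the key path is an example, not a real file):
#   add_local_key /root/.ssh /tmp/id_ed25519.pub
```

Since the new location is set in the sshd configuration, it is worth validating the configuration with sshd -t and reloading the SSH service before relying on it, and testing a fresh login while an existing session is still open.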

Tip

There are even better ways to approach this, e.g. SSH certificates, in which case your own setup is not prone to this bug. This is out of scope for this post.

FAQ

  1. What about non-privileged user & sudo?

This will work just fine, too. Note that PVE does not come with sudo and will nevertheless require that root be allowed to log in over SSH to preserve full features.

  2. Why is this considered a design flaw?

Due to the Proxmox stack setup, inaccessible SSH for the root user prevents you from e.g. troubleshooting failing services (when SSH itself is healthy), even from the GUI shell of a healthy node. It is impossible to remove SSH access for the root account in Proxmox without losing features, some of which are documented.

Since you cannot disable root over SSH, you might as well embrace it. However, if you have another way in (e.g. FAQ 1), it is just as good (the GUI path will still not work, though).

  3. The incidence ratio of system “down” (but with full networking) vs “down down” (when it needs to be rescued from console / KVM) seems low.

The issue is that a failure of the pve-cluster service at boot (which needs to run on standalone nodes as well) is what causes the “lockout”, and it is quite a common side effect of e.g. networking misconfiguration or corruption of the pmxcfs backend database. These are out of scope of this post, but they definitely happen more often than SSH alone failing, let alone networking as a whole. Also note that lots of home systems do not have OOB/KVM, or even rely entirely on the GUI.