"Too many authentication failures" error connecting to hosts when multiple keys are loaded in ssh-agent on Ubuntu

Summary

I get the following ssh connection errors when running detective:

fatal: [13.239.139.206]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: Warning: Permanently added '13.239.139.206' (ED25519) to the list of known hosts.\r\nReceived disconnect from 13.239.139.206 port 22:2: Too many authentication failures\r\nDisconnected from 13.239.139.206 port 22", "unreachable": true}

I think these occur because of a combination of:

  • the container has access to my local ssh-agent, which makes multiple keys available to be tried
  • the container has RW access to my local .ssh folder, which is not good from a security point of view
  • the -o IdentitiesOnly=yes option is not used by ansible (though adding it wouldn't address the two points above)

I believe the docker command should mount only the specified key file in the container - there is no need for it to have access to all of my keys in ~/.ssh or my ssh-agent.
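As a rough sketch of the narrower mount suggested above: the helper below is hypothetical (key_mount_args is not part of detective) and only prints the single key-related docker option I would expect to be needed. It assumes the key keeps the /root/.ssh/<key name> path inside the container that the current command already uses.

```shell
# Hypothetical helper, not part of detective: print the only key-related
# docker option needed when a key file is passed with -S.
key_mount_args() {
  key_path="$1"
  key_name="$(basename "$key_path")"
  # Mount just the one key, read-only. No ~/.ssh mount and no agent socket.
  printf -- '-v %s:/root/.ssh/%s:ro\n' "$key_path" "$key_name"
}

key_mount_args /home/jfarmiloe/tmp/gitlab-detective/get-3k-ssh-key
```

If passphrase-protected keys need to keep working through the agent, the agent socket mount and SSH_AUTH_SOCK would still be required, so this sketch only covers the no-passphrase case.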

For the record, on my Ubuntu 24.04 laptop the following are enabled/set by default:

jfarmiloe@~/tmp/gitlab-detective$ ps -ef | grep agent
jfarmil+    3865    3356  0 Feb20 ?        00:00:00 /usr/libexec/gcr-ssh-agent --base-dir /run/user/1000/gcr
jfarmil+  195117    3382  0 19:45 ?        00:00:00 /usr/bin/ssh-agent -D -a /run/user/1000/keyring/.ssh

jfarmiloe@~/tmp/gitlab-detective$ env | grep SSH
SSH_AUTH_SOCK=/run/user/1000/keyring/ssh
GSM_SKIP_SSH_AGENT_WORKAROUND=true

The docker command line run by detective when I specify -S /home/jfarmiloe/tmp/gitlab-detective/get-3k-ssh-key is:

docker run --rm --tty --interactive --workdir /runner/project -v /run/user/1000/keyring/:/run/user/1000/keyring/ -e SSH_AUTH_SOCK=/run/user/1000/keyring/ssh -v /home/jfarmiloe/.ssh/:/home/runner/.ssh/ -v /home/jfarmiloe/.ssh/:/root/.ssh/ -v /home/jfarmiloe/tmp/gitlab-detective/private/artifacts/:/runner/artifacts/:Z -v /home/jfarmiloe/tmp/gitlab-detective/private/:/runner/:Z -v /home/jfarmiloe/tmp/gitlab-detective:/working -v /home/jfarmiloe/tmp/gitlab-detective/get-3k-ssh-key:/root/.ssh/get-3k-ssh-key:ro --env-file /home/jfarmiloe/tmp/gitlab-detective/private/artifacts/f03e5bac-119e-4989-b463-b6a9af68ab1c/env.list --user=1000 --name ansible_runner_f03e5bac-119e-4989-b463-b6a9af68ab1c --user=1000 registry.gitlab.com/gitlab-com/support/toolbox/gitlab-detective:latest ansible-playbook /detective/playbooks/playbook.yml -i /working/hosts.yml --ssh-common-args '-o IgnoreUnknown UseKeychain,AddKeysToAgent' --extra-vars '' --private-key /root/.ssh/get-3k-ssh-key 

Note there are 5 key-related options being used:

-v /run/user/1000/keyring/:/run/user/1000/keyring/ 
-e SSH_AUTH_SOCK=/run/user/1000/keyring/ssh
-v /home/jfarmiloe/.ssh/:/home/runner/.ssh/ 
-v /home/jfarmiloe/.ssh/:/root/.ssh/ 
-v /home/jfarmiloe/tmp/gitlab-detective/get-3k-ssh-key:/root/.ssh/get-3k-ssh-key:ro

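I also see a zero-byte ~/.ssh/get-3k-ssh-key file owned by root created on my local system when detective runs, which shouldn't happen. Presumably Docker creates it as the mount point for the key file, because /root/.ssh in the container is itself a bind mount of my ~/.ssh.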

I reproduced the error by manually running the same ssh command that the playbook runs, e.g.

ssh -vvv -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o Port=22 -o 'IdentityFile="/root/.ssh/get-3k-ssh-key"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="ubuntu"' -o ConnectTimeout=10 -o IgnoreUnknown=UseKeychain,AddKeysToAgent -o 'ControlPath="/runner/.ansible/cp/91aaaacc6b"' 16.176.182.72 sh

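which logged seven keys to be tried, six of them offered via ssh-agent and listed ahead of the explicitly specified key (sshd's default MaxAuthTries is 6, so if the hosts use the default, the agent keys alone can presumably exhaust the limit before the explicit key is offered):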

debug1: Will attempt key: jfarmiloe@jfarmiloe-Precision-5550 RSA SHA256:E9xNT35NiW603hkSAGxRB3dkY8IxOpHGMfzwKJLZ4iU agent
debug1: Will attempt key: jfarmiloe@jfarmiloe--20210201-BH593 ED25519 SHA256:uxQPLJX7L/XMoqqVxvIHhyX7gn9oi7GWRAAE9NHLZVc agent
debug1: Will attempt key: jfarmiloe@jfarmiloe--20260218-26364 RSA SHA256:m/kMZIuKHx40ZZjwIdAfNoy8kizcXdVQN+gPfhgJvK4 agent
debug1: Will attempt key: jfarmiloe@jfarmiloe--20260218-26364 ED25519 SHA256:Hfg49ivHimQnFKvAwCmTolg9mAebhGuoyu8QjSdWmt4 agent
debug1: Will attempt key: jfarmiloe@gitlab.com ED25519 SHA256:2fnE4aNsf+GQGZgBkPGGB8HJbeBxoNPs9jOGWINk7V0 agent
debug1: Will attempt key: GDK container access RSA SHA256:BPuOQ9YjSMuVWHLhWdiyluLGTMEylDsdJov3OfUNy+A agent
debug1: Will attempt key: /root/.ssh/get-3k-ssh-key RSA SHA256:p2ICF350mdfYGzuOcaEXwQ23Uk1XqndxQil5nJZMti0 explicit
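To check quickly how many agent keys ssh will offer before the explicit one, the debug output can be filtered. This is a minimal sketch: count_agent_keys is a hypothetical helper name, and it assumes the "Will attempt key: ... agent" line format shown in the log above.

```shell
# Hypothetical helper: read `ssh -v` output on stdin and print how many
# candidate keys come from the agent.
count_agent_keys() {
  # grep -c exits non-zero when the count is 0, so mask the status.
  grep -c 'Will attempt key: .* agent$' || true
}

# Example usage (user, host and key path taken from the command above):
#   ssh -v -i /root/.ssh/get-3k-ssh-key ubuntu@16.176.182.72 true 2>&1 | count_agent_keys
```

If the count is at or above the server's MaxAuthTries, the explicit key may never be offered.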

If I stop ssh-agent on my local machine then ansible is able to connect to the hosts.

Also if I run unset SSH_AUTH_SOCK in the container only the specified key is used.

I'm now wondering whether the intention of using ssh-agent with ansible-runner is to allow passphrase-protected keys to be used by the container without having to specify the passphrases in the config (in my testing, my GET key has no passphrase). If so, adding the -o IdentitiesOnly=yes option should prevent the "Too many authentication failures" error when the local agent has several keys loaded, because ssh would then offer only the explicitly configured identity and could still use the agent to sign for it. I still think ~/.ssh shouldn't be mounted as a volume.
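A minimal sketch of how the option could be passed, assuming detective keeps forwarding extra ssh options through --ssh-common-args as in the docker command above. The SSH_COMMON_ARGS variable name is only for illustration, and the snippet prints the arguments rather than running ansible-playbook.

```shell
# Assumption: extra ssh options continue to be passed via --ssh-common-args.
# IdentitiesOnly=yes tells ssh to offer only explicitly configured identities,
# even when the agent holds more keys.
SSH_COMMON_ARGS='-o IgnoreUnknown=UseKeychain,AddKeysToAgent -o IdentitiesOnly=yes'

# Print the ansible-playbook arguments that would carry the option (not executed here).
printf '%s\n' "--ssh-common-args '${SSH_COMMON_ARGS}' --private-key /root/.ssh/get-3k-ssh-key"
```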

Steps to reproduce

What is the current bug behavior?

What is the expected correct behavior?

Relevant logs and/or screenshots

Target GitLab environment info

Local Environment Info

Ubuntu 24.04

Possible fixes

Edited by Justin Farmiloe