AWS: wait for cloud-init to finish
Problem description
I noticed that the current docker-machine (version 0.16.2-gitlab.4, build 8e2b58b4; no change in gitlab.5 according to the history) doesn't wait for cloud-init to finish; it only waits for SSH to become available.
This leads to several problems:
- The wrong mirrors are used for installing packages, because cloud-init has not yet finished replacing the AMI's apt repositories with the AWS mirrors.
- Updating the old repos occasionally fails before the cloud-init changes are applied, since validation of the repo signatures is sometimes not completely set up yet.
As far as I know, this cannot be handled inside the machine's user-data, because a command there that stops SSH would run too late in the boot process.
Bug fix / Feature request
It would be great if you could (optionally, via a flag) wait for cloud-init to finish before provisioning the newly created machine.
One way to check might be to test for the existence of the file /var/lib/cloud/instance/boot-finished.
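To illustrate the idea, here is a minimal sketch of such a wait loop in Python. It is not docker-machine code: `wait_for_cloud_init` and the `run_remote` hook are hypothetical names, and `run_remote` is assumed to run a shell command on the new machine (for example over SSH) and return its exit status. The timeout and interval values are arbitrary defaults.

```python
import time
from typing import Callable

# Marker file that cloud-init writes once its final stage has completed.
BOOT_FINISHED = "/var/lib/cloud/instance/boot-finished"


def wait_for_cloud_init(
    run_remote: Callable[[str], int],
    timeout: float = 300.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the marker file exists or the timeout expires.

    run_remote is a hypothetical hook: it runs a shell command on the
    new machine and returns the command's exit status.
    Returns True if the marker appeared, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        # `test -f` exits with 0 only if the path is a regular file.
        if run_remote(f"test -f {BOOT_FINISHED}") == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Provisioning would only start once this returns True. On images with a recent cloud-init, running `cloud-init status --wait` on the machine might be a simpler alternative, assuming that subcommand is available in the installed version.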
Example
In my example, I use the latest Ubuntu 18.04 AMI as a base. I noticed that the old repos are still in use when provisioning with docker-machine:
…
About to run SSH command:
sudo apt-get update
SSH cmd err, output: <nil>: Get:1 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Hit:2 http://archive.ubuntu.com/ubuntu bionic InRelease
Get:3 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
Get:4 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]
Get:5 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [814 kB]
…
Get:27 http://archive.ubuntu.com/ubuntu bionic-backports/universe amd64 Packages [7736 B]
Get:28 http://archive.ubuntu.com/ubuntu bionic-backports/universe Translation-en [4588 B]
Fetched 19.1 MB in 4s (4286 kB/s)
Reading package lists...
However, cloud-init later switches the repos to a smaller and faster set served from the AWS regional mirrors:
Hit:1 http://eu-west-1.ec2.archive.ubuntu.com/ubuntu bionic InRelease
Hit:2 http://eu-west-1.ec2.archive.ubuntu.com/ubuntu bionic-updates InRelease
Hit:3 http://eu-west-1.ec2.archive.ubuntu.com/ubuntu bionic-backports InRelease
Hit:4 https://download.docker.com/linux/ubuntu bionic InRelease
Hit:5 http://security.ubuntu.com/ubuntu bionic-security InRelease
Notes
This race condition might also happen on other platforms where cloud-init is used. Adding this wait might also fix some odd behavior that others see when creating machines.