For some extra context: when a GitLab Runner uses the docker executor (with bridge networking) and a job needs to interact with the AWS EC2 instance metadata service (IMDSv2), there is a ~20 second delay because the container is unable to reach the IMDSv2 endpoint. The default hop limit (HttpPutResponseHopLimit) is 1, and with Docker bridge networking (NAT) it needs to be raised to 2. What needs to be implemented is the ability to set InstanceMetadataOptions on the AWS driver for docker-machine, so that newly launched EC2 instances can get an HttpPutResponseHopLimit of >= 2.
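To make the request above concrete, here is a minimal sketch of the metadata options a launch request would need to carry. It is not the docker-machine driver code. The field names (`HttpEndpoint`, `HttpTokens`, `HttpPutResponseHopLimit`) are the ones EC2 uses for instance metadata options; the helper names `build_metadata_options` and `token_response_reaches` are hypothetical, and the hop check is a deliberately simplified model of how the hop limit behaves behind a Docker bridge.

```python
def build_metadata_options(http_tokens="required", hop_limit=2):
    """Build the metadata options for an EC2 launch request.

    Hypothetical helper: the returned dict mirrors the shape EC2 expects
    for instance metadata options (for example as `MetadataOptions` in a
    RunInstances call), but nothing here talks to AWS.
    """
    if http_tokens not in ("optional", "required"):
        raise ValueError("http_tokens must be 'optional' or 'required'")
    if not 1 <= hop_limit <= 64:
        raise ValueError("hop_limit must be between 1 and 64")
    return {
        "HttpEndpoint": "enabled",
        "HttpTokens": http_tokens,
        "HttpPutResponseHopLimit": hop_limit,
    }


def token_response_reaches(hop_limit, network_hops):
    """Simplified model (an assumption, not AWS code): the token response
    is delivered only if its hop limit is larger than the number of
    network hops it has to cross. A process on the instance itself crosses
    0 hops; a container behind a Docker bridge crosses 1.
    """
    return hop_limit > network_hops


if __name__ == "__main__":
    print(build_metadata_options())
    print("container, hop limit 1:", token_response_reaches(1, 1))
    print("container, hop limit 2:", token_response_reaches(2, 1))
```

Under that model, a hop limit of 1 works for processes on the instance but not for jobs running in a bridged container, which matches the behavior described above.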
Any chance we could revisit this issue? We would like to avoid having our customer run V1 as an exception - but have no choice until V2 is supported.
Based on what I'm hearing - this might be a low hanging fruit that would help avoid customers running our docker-machine in AWS as an exception to their security policies.
Is SDK update enough to start supporting AWS EC2 IMDSv2? Do we know which SDK version introduces it? I see the newest release at https://github.com/aws/aws-sdk-go is 1.36.4 and the MR points 1.36.3, so I'd assume it should already support this instance type, right?
@erushton fyi - I am adding the workflow::start label to this docker+machine issue for customers on AWS so that we can consider it for prioritization probably in Q1.
While !47 (merged) added support in the SDK, docker-machine will still default to creating machine instances with the token set to optional. This means that if the account doesn't allow IMDSv1, docker-machine won't be able to spawn new instances.
I think the drop-down from the EC2 console will clarify:
Bumping the SDK version allowed us to run docker-machine commands from an instance with "token required". However, the instances it creates will default to have "token optional". In the specific case we're working with, the account won't allow such instances to be instantiated.
I created a draft MR hardcoding it to "token required" to investigate further, it might also help to clarify things: !49 (diffs)
If the state is optional, you can choose to retrieve instance metadata with or without a signed token header on your request. If you retrieve the IAM role credentials without a token, the version 1.0 role credentials are returned. If you retrieve the IAM role credentials using a valid signed token, the version 2.0 role credentials are returned.
If the state is required, you must send a signed token header with any instance metadata retrieval requests. In this state, retrieving the IAM role credentials always returns the version 2.0 credentials; the version 1.0 credentials are not available.
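For anyone following along, the difference between the two states can be sketched from the client side. This is a minimal illustration using only the Python standard library, not what docker-machine or the AWS SDK does internally. The endpoint paths and header names are the documented IMDS ones; the function names and the `base_url` parameter (there so the code can be pointed at a stand-in server) are my own assumptions.

```python
import urllib.request

IMDS_BASE_URL = "http://169.254.169.254"  # link-local IMDS address on EC2

# IMDS is link-local, so never send these requests through a proxy.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch_imds_token(base_url=IMDS_BASE_URL, ttl_seconds=21600, timeout=2.0):
    """IMDSv2 step 1: PUT /latest/api/token to obtain a session token."""
    request = urllib.request.Request(
        f"{base_url}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )
    with _opener.open(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def fetch_metadata(path, token=None, base_url=IMDS_BASE_URL, timeout=2.0):
    """Read a metadata path. With a token this is an IMDSv2 request;
    without one it is an IMDSv1 request, which an instance in the
    "required" state rejects with an HTTP error.
    """
    headers = {"X-aws-ec2-metadata-token": token} if token else {}
    request = urllib.request.Request(
        f"{base_url}/latest/meta-data/{path}", headers=headers
    )
    with _opener.open(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

On an instance, `fetch_metadata("instance-id", token=fetch_imds_token())` would be the V2 path, while `fetch_metadata("instance-id")` is the V1 path that only works while the state is optional.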
It seems to me that this is a client issue with how the IAM role credentials are retrieved? The current state means that either V1 or V2 can be used. !49 (merged) will just require anything running on the EC2 instance to use V2. That seems like it should be a parameter.
IAM policies and SCPs: You can use an IAM condition to enforce that IAM users can't launch an instance unless it uses IMDSv2. You can also use IAM conditions to enforce that IAM users can't modify running instances to re-enable IMDSv1, and to enforce that the instance metadata service is available on the instance.
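As an illustration of the kind of restriction described in that excerpt, here is a sketch that prints a deny policy built on the `ec2:MetadataHttpTokens` condition key. I don't know the actual policy used in the account in question, so the statement ID and resource pattern are assumptions, as is the helper name `imds_v2_only_policy`.

```python
import json


def imds_v2_only_policy():
    """Return a policy document that denies ec2:RunInstances unless the
    launch request sets the metadata token state to "required".

    Hypothetical example: the Sid and the resource pattern are assumptions
    and would need adjusting for a real account.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RequireImdsV2",
                "Effect": "Deny",
                "Action": "ec2:RunInstances",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    "StringNotEquals": {"ec2:MetadataHttpTokens": "required"}
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(imds_v2_only_policy(), indent=2))
```

With a policy like this in place, a launch that leaves the token state at optional is denied, which would explain why docker-machine cannot create instances in such an account until the driver can request "token required".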
Thanks for the pointer about the hop limit, as soon as I manage to overcome my golang limitations I'll add that to the MR!
@tpresa First of all, thank you for your contribution.
I just now tried the addition, setting the following docker-machine options with the latest docker-machine version 0.16.2-gitlab.11 for gitlab-runner and docker-machine executor:
Upon monitoring the runners being spawned, I noticed that the CloudWatch metric MetadataNoToken is not zero, which means metadata requests are still being made without IMDSv2. Additionally, Security Hub, which monitors such instances, is complaining.
I have not tried it standalone, without gitlab-runner, but I would guess that the parameters are passed straight through.
Anything I'm doing wrong? Is the addition available?
As for the config, the only thing I see is that you have an oddly positioned close quote on "amazonec2-metadata-token-response-hop-limit"=2,, but I'm not sure that would account for the behavior you're seeing. Could you please try fixing that?
If you have access, you can also contact support and have a support engineer help you troubleshoot: https://about.gitlab.com/support/. I'll do my best to try your config later though.