Operation 'VirtualMachineScaleSets.virtualMachines.GET' is not allowed on Virtual Machine Scale Set
While trying out fleeting-plugin-azure, I ran into this error:
Runtime platform arch=amd64 os=linux pid=7 revision=d89a789a version=16.4.1
Starting multi-runner from /etc/gitlab-runner/config.toml... builds=0 max_builds=0
Running in system-mode.
Created missing unique system ID system_id=r_nYlMEFC3a0DQ
Configuration loaded builds=0 max_builds=10
Metrics server listening address=0.0.0.0:9090 builds=0 max_builds=10
[session_server].listen_address not defined, session endpoints disabled builds=0 max_builds=10
Initializing executor providers builds=0 max_builds=10
2023-10-23T14:20:09.109Z [WARN] plugin: plugin configured with a nil SecureConfig
2023-10-23T14:20:09.109Z [DEBUG] plugin: starting plugin: path=/usr/bin/fleeting-plugin-azure args=["fleeting-plugin-azure"]
2023-10-23T14:20:09.109Z [DEBUG] plugin: plugin started: path=/usr/bin/fleeting-plugin-azure pid=22
2023-10-23T14:20:09.109Z [DEBUG] plugin: waiting for RPC address: path=/usr/bin/fleeting-plugin-azure
2023-10-23T14:20:09.111Z [DEBUG] plugin.fleeting-plugin-azure: plugin address: address=/tmp/plugin2366112291 network=unix timestamp=2023-10-23T14:20:09.111Z
2023-10-23T14:20:09.111Z [DEBUG] plugin: using plugin: version=0
2023-10-23T14:20:09.112Z [TRACE] plugin.stdio: waiting for stdio data
2023-10-23T14:20:10.696Z [INFO] plugin initialized: version=v0.1.0 build info="sha=88a07d32; ref=refs/pipelines/902821968; go=go1.19.6; built_at=2023-06-16T19:00:28+0000; os_arch=linux/amd64"
2023-10-23T14:20:16.619Z [DEBUG] plugin.stdio: received EOF, stopping recv loop: err="rpc error: code = Unavailable desc = error reading from server: EOF"
2023-10-23T14:20:16.621Z [INFO] plugin: plugin process exited: path=/usr/bin/fleeting-plugin-azure pid=22
2023-10-23T14:20:16.621Z [DEBUG] plugin: plugin exited
WARNING: Failed to process runner builds=0 error=failed to update executor: initializing taskscaler: creating taskscaler: initializing provisioner: reconciling with instance group: rpc error: code = Unknown desc = GET https://management.azure.com/subscriptions/aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa/resourceGroups/my-awesome-resourcegroup/providers/Microsoft.Compute/virtualMachineScaleSets/my-awesome-vmss/virtualMachines
--------------------------------------------------------------------------------
RESPONSE 400: 400 Bad Request
ERROR CODE: BadRequest
--------------------------------------------------------------------------------
{
"error": {
"code": "BadRequest",
"message": "Operation 'VirtualMachineScaleSets.virtualMachines.GET' is not allowed on Virtual Machine Scale Set 'my-awesome-vmss'."
}
}
--------------------------------------------------------------------------------
executor=instance max_builds=10 runner=FJsyXeT1b
I'm using an Azure application (and its service principal), with credentials provided via the environment variables AZURE_TENANT_ID, AZURE_CLIENT_ID and AZURE_CLIENT_SECRET. It has the Contributor role on the VMSS and the Reader role on the whole resource group. However, querying this endpoint with my personal account (via Azure AD) from the API testing page works without issues.
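As a sanity check that all three variables actually reach the runner process (for example inside the container), something like the following could be run in the same environment. This is only a sketch: the helper name `missing_credentials` is hypothetical, and only the variable names come from the setup described above.

```python
import os

# The three variables named in this report.
REQUIRED_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def missing_credentials(env):
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials(os.environ)
    if missing:
        print("missing or empty:", ", ".join(missing))
    else:
        print("all service principal variables are set")
```

An empty result only shows that the variables are present, not that the credentials are valid or that the role assignments are sufficient.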
If this were a permission issue, I would expect a 401 or 403 response rather than a 400. Could the plugin be using an older Azure API version that causes the error? The endpoint exists in API version 2023-07-01, which seems rather recent. Unfortunately, the logs don't show which API version the plugin requests.
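To compare API versions by hand, the request from the log can be rebuilt with an explicit `api-version` query parameter and the error body reduced to its status, code and message. The following is a minimal sketch, not what the plugin does internally: the helper names `vmss_vms_url` and `describe_error` are hypothetical, and 2023-07-01 is simply the version mentioned above, since the log does not show the version the plugin sends.

```python
import json
from urllib.parse import urlencode

ARM_ENDPOINT = "https://management.azure.com"


def vmss_vms_url(subscription_id, resource_group, vmss_name, api_version):
    """Build the 'list VMs in a scale set' URL seen in the log, plus api-version."""
    path = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Compute/virtualMachineScaleSets/{vmss_name}"
        "/virtualMachines"
    )
    return f"{ARM_ENDPOINT}{path}?{urlencode({'api-version': api_version})}"


def describe_error(status, body):
    """Summarize an error response body as 'status code: message'."""
    err = json.loads(body).get("error", {})
    return f"{status} {err.get('code', '?')}: {err.get('message', '')}"


if __name__ == "__main__":
    # Placeholder values, matching the anonymized ones in this report.
    print(vmss_vms_url(
        "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa",
        "my-awesome-resourcegroup",
        "my-awesome-vmss",
        "2023-07-01",
    ))
```

Sending the resulting URL with a bearer token obtained for the service principal, rather than for a personal account, would show whether the 400 depends on the identity or on the API version.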
The config I'm using:
listen_address = "0.0.0.0:9090"
concurrent = 10
check_interval = 5
log_level = "info"
shutdown_timeout = 0
[session_server]
session_timeout = 1800
[[runners]]
name = "instance autoscaler example"
limit = 2
output_limit = 204800
url = "https://example.com/"
id = 48
token = "XXXXXXXXXXXXXXXX"
token_obtained_at = 2023-10-23T12:37:10Z
token_expires_at = 0001-01-01T00:00:00Z
executor = "instance"
[runners.cache]
Type = "azure"
Path = "cache"
Shared = true
MaxUploadedArchiveSize = 0
[runners.cache.azure]
AccountName = "XXXXXXXXXXXXXXXX"
AccountKey = "XXXXXXXXXXXXXXXX"
ContainerName = "shared-cache-data"
# Autoscaler config
[runners.autoscaler]
plugin = "fleeting-plugin-azure"
capacity_per_instance = 5
max_use_count = 1
max_instances = 3
[runners.autoscaler.plugin_config] # plugin specific configuration (see plugin documentation)
subscription_id = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
resource_group_name = "my-awesome-resourcegroup"
name = "my-awesome-vmss" # VMSS name
#[runners.autoscaler.connector_config]
#username = "azureuser"
#use_external_addr = false
[[runners.autoscaler.policy]]
idle_count = 2
idle_time = "20m0s"
The subscription ID, resource group name, VMSS name and other sensitive fields have been altered in the output for privacy.