Pipeline job using resource_group gets stuck
Summary
The problem occurred on gitlab.com on Jun 4, 2021 at 7:04 pm (GMT +0000). The job ID is 1397478939. The job uses resource_group. At some point, the job ended with the errors "There has been a structural integrity problem detected, please contact system administrator" and "This job does not have a trace". Since then, all subsequent jobs have been stuck waiting. I found this case in the troubleshooting guide. Who can unstick my linux-x86_64 resource group?
Steps to reproduce
Example Project
What is the current bug behavior?
What is the expected correct behavior?
Relevant logs and/or screenshots
Output of checks
Results of GitLab environment info
SaaS.
Results of GitLab application Check
SaaS.
Possible fixes
Use the Rails console workaround from the troubleshooting guide: https://docs.gitlab.com/ee/ci/troubleshooting.html#console-workaround-if-job-using-resource_group-gets-stuck
# find resource group by name
resource_group = Project.find_by_full_path('...').resource_groups.find_by(key: 'the-group-name')
busy_resources = resource_group.resources.where('build_id IS NOT NULL')
# identify which builds are occupying the resource
# (I think it should be 1 as of today)
busy_resources.pluck(:build_id)
# it's good to check why this build is holding the resource.
# Is it stuck? Has it been forcefully dropped by the system?
# free up busy resources
busy_resources.update_all(build_id: nil)
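
To illustrate why one stale row can block every later job, here is a plain-Ruby toy model. It is not GitLab's actual implementation: `Resource`, `ToyResourceGroup`, `assign`, `busy_build_ids`, and `release_stale` are hypothetical names, and the sketch assumes a resource group holds a single resource that records the `build_id` occupying it.

```ruby
# Toy model only: hypothetical classes, not GitLab's real models.
Resource = Struct.new(:build_id)

class ToyResourceGroup
  attr_reader :resources

  def initialize
    # Assumption: one resource per group, so only one job runs at a time.
    @resources = [Resource.new(nil)]
  end

  # Hand a free resource to the build; return false if none is free.
  def assign(build_id)
    free = @resources.find { |r| r.build_id.nil? }
    return false unless free

    free.build_id = build_id
    true
  end

  # Build ids currently occupying a resource.
  def busy_build_ids
    @resources.map(&:build_id).compact
  end

  # Loosely mirrors `update_all(build_id: nil)`, but only for builds
  # the caller has confirmed are finished.
  def release_stale(finished_build_ids)
    @resources.each do |r|
      r.build_id = nil if finished_build_ids.include?(r.build_id)
    end
  end
end

group = ToyResourceGroup.new
group.assign(101)           # build 101 takes the resource and never releases it
group.assign(102)           # returns false: build 102 has to wait
group.release_stale([101])  # manual cleanup frees the resource
group.assign(102)           # returns true: build 102 can now run
```

In this sketch the cleanup is limited to builds known to be finished, which matches the advice above to check why a build is holding the resource before freeing it.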