Runner stuck on pending and after "Successfully extracted cache"

Issue from here

Adding it here since this is the appropriate project?

Summary

The runner keeps getting stuck on pending, even on a new standalone instance with no other jobs.
Note: using Docker executor with gitlab-runner 10.8.0.

It also gets stuck while running, right in the middle of a job, usually after "Successfully extracted cache".

Steps to reproduce

Nothing special: just keep trying to run jobs. It happens quite frequently.

Output of checks

Results of GitLab environment info

System information
System: Ubuntu 18.04
Proxy: no
Current User: git
Using RVM: no
Ruby Version: 2.3.7p456
Gem Version: 2.6.14
Bundler Version: 1.13.7
Rake Version: 12.3.1
Redis Version: 3.2.11
Git Version: 2.16.3
Sidekiq Version: 5.0.5
Go Version: unknown

GitLab information
Version: 10.8.1-ee
Revision: 921025f5ffa
Directory: /opt/gitlab/embedded/service/gitlab-rails
DB Adapter: postgresql
DB Version: 9.6.8
URL: https://git.company.com
HTTP Clone URL: https://git.company.com/some-group/some-project.git
SSH Clone URL: git@git.company.com:some-group/some-project.git
Elasticsearch: no
Geo: no
Using LDAP: no
Using Omniauth: no

GitLab Shell
Version: 7.1.2
Repository storage paths:
  • default: /var/opt/gitlab/git-data/repositories
Hooks: /opt/gitlab/embedded/service/gitlab-shell/hooks
Git: /opt/gitlab/embedded/bin/git

Results of GitLab application Check

Checking GitLab Shell ...

GitLab Shell version >= 7.1.2 ? ... OK (7.1.2)
Repo base directory exists?
default... yes
Repo storage directories are symlinks?
default... no
Repo paths owned by git:root, or git:git?
default... yes
Repo paths access is drwxrws---?
default... yes
hooks directories in repos are links: ...
4/8 ... ok
4/10 ... ok
4/12 ... ok
4/14 ... ok
Running /opt/gitlab/embedded/service/gitlab-shell/bin/check
Check GitLab API access: OK
Redis available via internal API: OK
Access to /var/opt/gitlab/.ssh/authorized_keys: OK
gitlab-shell self-check successful

Checking GitLab Shell ... Finished

Checking Sidekiq ...

Running? ... yes
Number of Sidekiq processes ... 1

Checking Sidekiq ... Finished

Reply by email is disabled in config/gitlab.yml

Checking LDAP ...

LDAP is disabled in config/gitlab.yml

Checking LDAP ... Finished

Checking GitLab ...

Git configured correctly? ... yes
Database config exists? ... yes
All migrations up? ... yes
Database contains orphaned GroupMembers? ... no
GitLab config exists? ... yes
GitLab config up to date? ... yes
Log directory writable? ... yes
Tmp directory writable? ... yes
Uploads directory exists? ... yes
Uploads directory has correct permissions? ... yes
Uploads directory tmp has correct permissions? ... yes
Init script exists? ... skipped (omnibus-gitlab has no init script)
Init script up-to-date? ... skipped (omnibus-gitlab has no init script)
Projects have namespace: ...
4/8 ... yes
4/10 ... yes
4/12 ... yes
4/14 ... yes
Redis version >= 2.8.0? ... yes
Ruby version >= 2.3.5 ? ... yes (2.3.7)
Git version >= 2.9.5 ? ... yes (2.16.3)
Git user has default SSH configuration? ... yes
Active users: ... 2
Elasticsearch version 5.1 - 5.5? ... skipped (elasticsearch is disabled)

Checking GitLab ... Finished


Not sure if this is related, but I spent many frustrating hours trying to inject an SSH private key into the Docker executor, as instructed here. I had done this successfully many times before, but for some reason it wasn't working.

After sleeping on it, I looked at my last failed job (timed out after 100 minutes) and tried to run it again. It worked. The funny thing is that I had not changed my .gitlab-ci.yml file at all, and the job had already failed many times with "GitLab: The project you were looking for could not be found." Not sure if it's related to the stuck-on-pending issue or not.

My .gitlab-ci.yml file:

image: google/dart:1.24.3

variables:
  REGISTRY: git.company.com:4567
  GIT_SUBMODULE_STRATEGY: recursive

cache:
  paths:
    - path/to/stuff/

before_script:
  - mkdir -p ~/.ssh
  - echo "$SSH_PRIVATE_KEY" | tr -d '\r' > ~/.ssh/id_rsa
  - chmod 700 ~/.ssh/id_rsa
  - eval "$(ssh-agent -s)"
  - ssh-add ~/.ssh/id_rsa
  - ssh-keyscan -H 'git.company.com' >> ~/.ssh/known_hosts

types:
  - analyze

test_main:
  type: analyze
  script:
    - cd main
    - pub get
    - dartanalyzer --fatal-hints --fatal-warnings web/main.dart

Log on failure:

Running with gitlab-runner 10.8.0 (079aad9e)
  on docker 5fa73ab5
Using Docker executor with image google/dart:1.24.3 ...
Pulling docker image google/dart:1.24.3 ...
Using docker image sha256:ec4b124b54db920b6082d8b6d754b0329278c6deb798f9af1f5b1f7fda2c99de for google/dart:1.24.3 ...
Running on runner-5fa73ab5-project-12-concurrent-0 via gitlab...
Fetching changes...
HEAD is now at 06796d55 Remove old ssh key
From https://git.company.com/group/project
   06796d55..abceee93  ci-mods    -> origin/ci-mods
Checking out abceee93 as ci-mods...
Updating/initializing submodules recursively...
Synchronizing submodule url for 'other-project'
Entering 'other-project'
HEAD is now at e6ca36d My commit message
Checking cache for default-1...
Successfully extracted cache
$ which ssh-agent || ( apt-get update -y && apt-get install openssh-client -y )
/usr/bin/ssh-agent
$ eval $(ssh-agent -s)
Agent pid 13
$ echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add - > /dev/null
Identity added: (stdin) (rsa w/o comment)
$ mkdir -p ~/.ssh
$ chmod 700 ~/.ssh
$ ssh-keyscan 'git.company.com' >> ~/.ssh/known_hosts
# git.company.com SSH-2.0-OpenSSH_7.6p1 Ubuntu-4
# git.company.com SSH-2.0-OpenSSH_7.6p1 Ubuntu-4
# git.company.com SSH-2.0-OpenSSH_7.6p1 Ubuntu-4
$ chmod 644 ~/.ssh/known_hosts
$ cd main
$ pub get
Resolving dependencies...
Git error. Command: git clone --mirror git@git.company.com:group/other-project /root/.pub-cache/git/cache/other-project-89f06eef00baf7ccb8eff6b5c0fcd5f84a517fb3
Cloning into bare repository '/root/.pub-cache/git/cache/other-project-89f06eef00baf7ccb8eff6b5c0fcd5f84a517fb3'...
GitLab: The project you were looking for could not be found.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
ERROR: Job failed: exit code 1

To be clear, the project it was failing on is on the same standalone GitLab server.
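
One way to narrow down the "could not be found" error, offered as a sketch rather than a confirmed fix, is to check which identity the job actually presents to the server. The helper name `key_looks_valid` is made up for illustration. `ssh-add -l` and `ssh -T` are standard OpenSSH commands, and git.company.com is the host from the log above. The network commands are left as comments so the snippet runs anywhere.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the argument looks like a PEM/OpenSSH
# private key (starts with a "-----BEGIN ... PRIVATE KEY-----" header).
key_looks_valid() {
  case "$1" in
    "-----BEGIN "*"PRIVATE KEY-----"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report whether the CI variable is populated with something key-shaped.
if key_looks_valid "${SSH_PRIVATE_KEY:-}"; then
  echo "SSH_PRIVATE_KEY looks like a private key"
else
  echo "SSH_PRIVATE_KEY is empty or malformed"
fi

# Inside the job itself (needs the running agent and network access):
#   ssh-add -l                    # fingerprints of the keys loaded in the agent
#   ssh -T git@git.company.com    # server greets the account matched to the key
```

If the greeting names an account or key that has no access to other-project, that would be consistent with the error in the log, though it would not explain why a later retry succeeded.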

This is pernicious behavior, not least because it contradicts the saying often attributed to Einstein:

Insanity: doing the same thing over and over again and expecting different results.

Update: I made a small change and the build is again stuck on "Successfully extracted cache". 😠
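
To find out which step hangs after the cache is extracted, one option is to wrap the suspect commands in coreutils `timeout`, so a hang fails within seconds instead of at the 100-minute job timeout. This is only a sketch: the `run_with_limit` helper and the limits shown are made up for illustration, and it assumes `timeout` is available in the job image.

```shell
#!/bin/sh
# Hypothetical helper: run a command with a time limit (in seconds) and
# report when it hung. `timeout` exits with status 124 when the limit is hit.
run_with_limit() {
  limit="$1"
  shift
  status=0
  timeout "$limit" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "TIMED OUT after ${limit}s: $*" >&2
  fi
  return "$status"
}

# Possible use with commands from the job above:
#   run_with_limit 60 ssh-add ~/.ssh/id_rsa
#   run_with_limit 60 ssh-keyscan -H 'git.company.com' >> ~/.ssh/known_hosts
#   run_with_limit 600 pub get
```

Whichever wrapped command prints the TIMED OUT line is the one that hangs, which should at least show whether the stall is in the SSH setup or in a later step.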

Unfortunately, this is holding up our entire development.
