Reopen - #32204 - Error if run git push: remote: /opt/gitlab/embedded/lib/ruby/2.3.0/logger.rb:703:in `initialize': Permission denied @ rb_sysopen - /var/log/gitlab/gitlab-shell/gitlab-shell.log (Errno::EACCES)

This is a vote for re-opening gitlab-ce#32204.

The bot issued the standard 'inactivity warning' at midnight ET on a Monday, and GitLab then closed the issue just 7 hours later, before anyone could respond to say it was still a problem. Seems like the wait should be a bare minimum of 24 hours.

We still experienced this on 11.5.5 (and have since about 10.0.x). When it happens, it appears to be within the first set of containers brought up after upgrading GitLab by moving to a new container version (but not on every upgrade). We bring up a special container to run the database migration, so it may be related to new containers running against an old schema, plus the database migration happening while they are up, plus some other condition, since it does not happen on every upgrade. Sometimes the ASG (AWS ECS) seems to detect and terminate the affected containers, but not reliably.

We have specific monitoring for the message in gitlab-ce#32204 with a pager alert.
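For anyone who wants a similar check, here is a minimal sketch of matching that message in log or command output. It assumes Python is available wherever the output is collected; the names `ERROR_PATTERN` and `find_log_permission_error` are illustrative, and this is not the actual alerting rule we run.

```python
import re

# Matches the error signature from gitlab-ce#32204 and captures the log path.
# The pattern is an assumption based on the message text in the issue title.
ERROR_PATTERN = re.compile(
    r"Permission denied @ rb_sysopen - (\S*gitlab-shell\.log) \(Errno::EACCES\)"
)


def find_log_permission_error(line):
    """Return the affected log path if the line carries the error, else None."""
    match = ERROR_PATTERN.search(line)
    return match.group(1) if match else None
```

A monitoring agent could call this on each line of `git push` output or container logs and page when it returns a path.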

We have spent a lot of time on root cause analysis and have not been able to crack it, both because it is inconsistent and because of the effect it has on the stack when it happens (it must be resolved quickly).

The warning in the code to run a repermissioning script does not inspire confidence because, aside from the fact we don't want to be doing manual tweaks to an HA setup, it does not result in the same permissions as when the containers come up properly.

I suspect that the repermissioning script is not maintained over time to exactly mirror the permissions that the stack actually generates on a proper bring-up. It may be stuck with a static snapshot of the permissions from when GitLab built the script.
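One way to test that suspicion would be to snapshot ownership and modes under the log directory in a healthy container and again after running the repermissioning script, then diff the two. A rough sketch, assuming Python is available in the container; `snapshot_permissions` and `diff_permissions` are hypothetical helper names, not part of GitLab.

```python
import os
import stat


def _describe(path):
    """Return (uid, gid, octal mode string) for a path without following symlinks."""
    info = os.lstat(path)
    return (info.st_uid, info.st_gid, oct(stat.S_IMODE(info.st_mode)))


def snapshot_permissions(root):
    """Map each path under root (relative to root) to its owner, group, and mode."""
    snapshot = {".": _describe(root)}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            snapshot[os.path.relpath(path, root)] = _describe(path)
    return snapshot


def diff_permissions(healthy, suspect):
    """Return {path: (healthy_entry, suspect_entry)} for every path that differs."""
    diffs = {}
    for path in sorted(set(healthy) | set(suspect)):
        if healthy.get(path) != suspect.get(path):
            diffs[path] = (healthy.get(path), suspect.get(path))
    return diffs
```

For example, `diff_permissions(snapshot_permissions("/var/log/gitlab"), repermissioned_snapshot)` would list every path where the script's result departs from a clean bring-up, with `repermissioned_snapshot` taken the same way after the script has run.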

Most importantly, the existence of the script seems to indicate a mystery in the code that GitLab does not understand well enough to solve at its root. It would be nice to find and eliminate the root problem.

We are still experiencing the mystery, and when we do, our HA GitLab stack looks very bad :(

/cc @aolson, @Steevo https://gitlab.my.salesforce.com/0016100000Kvaln

Edited Feb 04, 2019 by DarwinJS