[RCA] Omnibus Configuration Issues after upgrading GitLab QA to version 7.x
The v7.0.0 release of GitLab QA caused failures, starting with duplicate Omnibus Configurations.
Failure 1 (Duplicate entries)
Problem Statement: This was caused by the way we copy existing Omnibus Configurations into a blank GitLab instance. Instead of making a true copy of the existing configuration, the new instance still referred to the global Omnibus Configuration, resulting in two GitLab instances sharing the same configuration.
Solution: gitlab-qa!683 (merged) fixes this by referring to the String representation of the global Omnibus Configuration rather than the Object.
RuntimeError: Errors exist within the Omnibus Configuration!
Duplicate entry found: `postgresql['enable'] = false;
redis['enable'] = false;
nginx['enable'] = false;
prometheus['enable'] = false;
grafana['enable'] = false;
puma['enable'] = false;
sidekiq['enable'] = false;
gitlab_workhorse['enable'] = false;
gitlab_rails['rake_cache_clear'] = false;
gitlab_rails['auto_migrate'] = false;
gitaly['enable'] = false;
praefect['enable'] = true;
praefect['listen_addr'] = '0.0.0.0:2305';
praefect['prometheus_listen_addr'] = '0.0.0.0:9652';
praefect['auth_token'] = 'PRAEFECT_EXTERNAL_TOKEN';
praefect['database_host'] = 'postgres.test';
praefect['database_user'] = 'postgres';
praefect['database_port'] = 5432;
praefect['database_password'] = 'SQL_PASSWORD';
praefect['database_dbname'] = 'praefect_production';
praefect['database_sslmode'] = 'disable';
praefect['postgres_queue_enabled'] = true;
praefect['failover_enabled'] = true;
praefect['virtual_storages'] = {
'default' => {
'gitaly1' => {
'address' => 'tcp://gitaly1.test:8075',
'token' => 'PRAEFECT_INTERNAL_TOKEN',
'primary' => true
},
'gitaly2' => {
'address' => 'tcp://gitaly2.test:8075',
'token' => 'PRAEFECT_INTERNAL_TOKEN'
},
'gitaly3' => {
'address' => 'tcp://gitaly3.test:8075',
'token' => 'PRAEFECT_INTERNAL_TOKEN'
}
}
};`
/builds/gitlab-org/gitlab-qa-mirror/lib/gitlab/qa/runtime/omnibus_configuration.rb:58:in `sanitize!'
/builds/gitlab-org/gitlab-qa-mirror/lib/gitlab/qa/runtime/omnibus_configuration.rb:14:in `to_s'
/builds/gitlab-org/gitlab-qa-mirror/lib/gitlab/qa/component/gitlab.rb:220:in `to_s'
/builds/gitlab-org/gitlab-qa-mirror/lib/gitlab/qa/component/gitlab.rb:166:in `block in setup_omnibus'
/builds/gitlab-org/gitlab-qa-mirror/lib/gitlab/qa/docker/engine.rb:51:in `write_files'
/builds/gitlab-org/gitlab-qa-mirror/lib/gitlab/qa/component/gitlab.rb:165:in `setup_omnibus'
/builds/gitlab-org/gitlab-qa-mirror/lib/gitlab/qa/component/gitlab.rb:127:in `reconfigure'
/builds/gitlab-org/gitlab-qa-mirror/lib/gitlab/qa/component/base.rb:150:in `instance_no_teardown'
/builds/gitlab-org/gitlab-qa-mirror/lib/gitlab/qa/component/base.rb:45:in `instance'
/builds/gitlab-org/gitlab-qa-mirror/lib/gitlab/qa/scenario/test/integration/gitaly_cluster.rb:44:in `block in perform'
/builds/gitlab-org/gitlab-qa-mirror/lib/gitlab/qa/scenario/test/integration/gitaly_cluster.rb:36:in `tap'
/builds/gitlab-org/gitlab-qa-mirror/lib/gitlab/qa/scenario/test/integration/gitaly_cluster.rb:36:in `perform'
/builds/gitlab-org/gitlab-qa-mirror/lib/gitlab/qa/scenario/template.rb:8:in `block in perform'
/builds/gitlab-org/gitlab-qa-mirror/lib/gitlab/qa/scenario/template.rb:6:in `tap'
/builds/gitlab-org/gitlab-qa-mirror/lib/gitlab/qa/scenario/template.rb:6:in `perform'
/builds/gitlab-org/gitlab-qa-mirror/lib/gitlab/qa/runner.rb:71:in `run'
exe/gitlab-qa:8:in `<top (required)>'
Failing job: https://gitlab.com/gitlab-org/gitlab-qa-mirror/-/jobs/1129559804
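The reference-versus-copy behaviour behind Failure 1 can be illustrated with a minimal sketch. SketchConfig is a hypothetical stand-in written for this example, not the real Runtime::OmnibusConfiguration class, and its methods are assumptions made only to show the aliasing.

```ruby
# Hypothetical stand-in for an Omnibus configuration holder (not the real class).
class SketchConfig
  # Optionally seed the new object from an existing configuration's String form.
  def initialize(seed = nil)
    @entries = []
    @entries << seed.to_s if seed
  end

  # Append one configuration entry and return self so calls can be chained.
  def <<(entry)
    @entries << entry
    self
  end

  def to_s
    @entries.join("\n")
  end
end

# Aliasing: `shared` is the same object as `global`, so the global changes too.
global = SketchConfig.new << "postgresql['enable'] = false;"
shared = global
shared << "praefect['enable'] = true;"
puts global.to_s

# Copying through the String representation leaves the original untouched.
original = SketchConfig.new << "postgresql['enable'] = false;"
independent = SketchConfig.new(original.to_s)
independent << "praefect['enable'] = true;"
puts original.to_s
```

In this sketch the first puts shows both entries on the supposedly global object, while the second shows only the postgresql entry.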
Failure 2 (LDAPTLS / LDAPNoTLS being unable to have a properly configured /etc/gitlab/gitlab.rb)
Problem Statement: This issue appeared because the LDAP configuration could not be properly formed. Runtime::OmnibusConfiguration#sanitize! is a method dedicated to making the Omnibus Configuration well-formed. Since Component::Gitlab had its own custom class (a slimmed-down version of Runtime::OmnibusConfiguration), it did not have this method. sanitize! converts all double quotes (") to single quotes ('), which is important because we execute a docker exec against the GitLab Docker container and the command must be well-formed.
Solution: gitlab-qa!684 (merged) fixes this. Runtime::OmnibusConfiguration#initialize was changed to allow an existing configuration to be prepended to the object. Component::Gitlab#omnibus_configuration was then updated to instantiate a new Runtime::OmnibusConfiguration object rather than the barebones custom class defined in Component::Gitlab.
================================================================================
Recipe Compile Error in /opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab-ee/recipes/default.rb
================================================================================
SyntaxError
-----------
/etc/gitlab/gitlab.rb:7: syntax error, unexpected ',', expecting =>
..., bind_dn=>cn=admin,dc=example,dc=org, password=>admin, encr...
... ^
/etc/gitlab/gitlab.rb:7: syntax error, unexpected =>, expecting end-of-input
...n,dc=example,dc=org, password=>admin, encryption=>plain, ver...
... ^~
Cookbook Trace:
---------------
/opt/gitlab/embedded/cookbooks/cache/cookbooks/package/libraries/helpers/settings_helper.rb:106:in `block in from_file'
/opt/gitlab/embedded/cookbooks/cache/cookbooks/package/libraries/config_mash.rb:29:in `auto_vivify'
/opt/gitlab/embedded/cookbooks/cache/cookbooks/package/libraries/helpers/settings_helper.rb:106:in `from_file'
/opt/gitlab/embedded/cookbooks/cache/cookbooks/package/libraries/helpers/settings_helper.rb:106:in `block in from_file'
/opt/gitlab/embedded/cookbooks/cache/cookbooks/package/libraries/config_mash.rb:29:in `auto_vivify'
/opt/gitlab/embedded/cookbooks/cache/cookbooks/package/libraries/helpers/settings_helper.rb:106:in `from_file'
/opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/recipes/config.rb:22:in `from_file'
/opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/recipes/default.rb:26:in `from_file'
/opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab-ee/recipes/default.rb:20:in `from_file'
Relevant File Content:
----------------------
/opt/gitlab/embedded/cookbooks/cache/cookbooks/package/libraries/helpers/settings_helper.rb
Failures in ee:ldap_no_server, ee:ldap_no_tls and ee:ldap_tls.
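To see why the quote conversion in Failure 2 matters, the sketch below approximates what happens when a configuration string is embedded in a double-quoted shell argument. It assumes the configuration reaches the container through something like echo "..."; the exact command GitLab QA builds is not shown in this write-up. sanitize_quotes and as_seen_by_shell are hypothetical helpers, and Shellwords only approximates POSIX shell word splitting.

```ruby
require 'shellwords'

# Hypothetical stand-in for the double-to-single quote conversion.
def sanitize_quotes(config)
  config.tr('"', "'")
end

# Approximates the argument a POSIX shell would hand to `echo` when the
# configuration is wrapped in double quotes (assumed command shape).
def as_seen_by_shell(config)
  Shellwords.split(%(echo "#{config}")).last
end

raw = %(gitlab_rails['ldap_servers'] = {"bind_dn"=>"cn=admin,dc=example,dc=org"})

puts as_seen_by_shell(raw)                  # inner double quotes are consumed
puts as_seen_by_shell(sanitize_quotes(raw)) # single quotes survive intact
```

Without the conversion, the inner double quotes terminate the outer quoting and disappear, which resembles the bind_dn=>cn=admin,dc=example,dc=org fragment in the recipe compile error for this failure.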
Failure 3 (using mTLS unable to push to Gitaly)
Problem Statement: It appears that the Gitaly clusters are not being configured correctly, resulting in a 500 Internal Server Error when the test clones the repository.
Cloning into '.'...
remote: Internal server error
fatal: unable to access 'https://root@gitlab.test/gitlab-qa-sandbox-group/qa-test-2021-03-28-18-10-21-bdd6a4cb2ffb2154/mTLS-40d53383665acfef.git/': The requested URL returned error: 500
RSpec::Retry: 2nd try ./qa/specs/features/api/3_create/gitaly/gitaly_mtls_spec.rb:11
Failures:
1) Create Gitaly Using mTLS pushes to gitaly
Failure/Error:
Resource::Repository::ProjectPush.fabricate! do |push|
push.project = project
push.new_branch = false
push.commit_message = first_added_commit_message
push.file_content = 'First commit'
end
QA::Support::Run::CommandError:
The command HOME="/tmp/qa-netrc-credentials/20" git clone https://root@gitlab.test/gitlab-qa-sandbox-group/qa-test-2021-03-28-18-10-21-bdd6a4cb2ffb2154/mTLS-cfecbbc88c93bd73.git ./ 2>&1 failed (128) with the following output:
Cloning into '.'...
remote: Internal server error
fatal: unable to access 'https://root@gitlab.test/gitlab-qa-sandbox-group/qa-test-2021-03-28-18-10-21-bdd6a4cb2ffb2154/mTLS-cfecbbc88c93bd73.git/': The requested URL returned error: 500
# ./qa/support/run.rb:36:in `run'
# ./qa/git/repository.rb:307:in `run_git'
# ./qa/git/repository.rb:65:in `clone'
# ./qa/resource/repository/push.rb:77:in `block in fabricate!'
# ./qa/scenario/actable.rb:16:in `perform'
# ./qa/git/repository.rb:33:in `block (2 levels) in perform'
# ./qa/git/repository.rb:33:in `chdir'
# ./qa/git/repository.rb:33:in `block in perform'
# ./qa/git/repository.rb:32:in `perform'
# ./qa/resource/repository/push.rb:50:in `fabricate!'
# ./qa/resource/repository/project_push.rb:43:in `fabricate!'
# ./qa/resource/base.rb:30:in `block (2 levels) in fabricate_via_browser_ui!'
# ./qa/resource/base.rb:135:in `log_fabrication'
# ./qa/resource/base.rb:30:in `block in fabricate_via_browser_ui!'
# ./qa/resource/base.rb:118:in `do_fabricate!'
# ./qa/resource/base.rb:29:in `fabricate_via_browser_ui!'
# ./qa/resource/base.rb:21:in `rescue in fabricate!'
# ./qa/resource/base.rb:18:in `fabricate!'
# ./qa/specs/features/api/3_create/gitaly/gitaly_mtls_spec.rb:17:in `block (4 levels) in <module:QA>'
# /usr/local/bundle/gems/rspec-retry-0.6.1/lib/rspec/retry.rb:123:in `block in run'
# /usr/local/bundle/gems/rspec-retry-0.6.1/lib/rspec/retry.rb:110:in `loop'
# /usr/local/bundle/gems/rspec-retry-0.6.1/lib/rspec/retry.rb:110:in `run'
# /usr/local/bundle/gems/rspec-retry-0.6.1/lib/rspec_ext/rspec_ext.rb:12:in `run_with_retry'
# ./spec/spec_helper.rb:78:in `block (2 levels) in <top (required)>'
# /usr/local/bundle/gems/rspec-retry-0.6.1/lib/rspec/retry.rb:123:in `block in run'
# /usr/local/bundle/gems/rspec-retry-0.6.1/lib/rspec/retry.rb:110:in `loop'
# /usr/local/bundle/gems/rspec-retry-0.6.1/lib/rspec/retry.rb:110:in `run'
# /usr/local/bundle/gems/rspec-retry-0.6.1/lib/rspec_ext/rspec_ext.rb:12:in `run_with_retry'
# /usr/local/bundle/gems/rspec-retry-0.6.1/lib/rspec/retry.rb:37:in `block (2 levels) in setup'
# ./qa/specs/runner.rb:73:in `perform'
# ./qa/scenario/template.rb:10:in `block in perform'
# ./qa/scenario/template.rb:8:in `tap'
# ./qa/scenario/template.rb:8:in `perform'
# ./qa/scenario/template.rb:35:in `perform'
# ./qa/scenario/template.rb:10:in `block in perform'
# ./qa/scenario/template.rb:8:in `tap'
# ./qa/scenario/template.rb:8:in `perform'
# ./qa/scenario/bootable.rb:28:in `launch!'
# ------------------
# --- Caused by: ---
# NotImplementedError:
# NotImplementedError
# ./qa/resource/base.rb:41:in `fabricate_via_api!'
Finished in 21.07 seconds (files took 29.71 seconds to load)
1 example, 1 failure
Failed examples:
rspec ./qa/specs/features/api/3_create/gitaly/gitaly_mtls_spec.rb:11 # Create Gitaly Using mTLS pushes to gitaly
Seen in https://gitlab.com/gitlab-org/gitlab-qa-mirror/-/jobs/1134720346
Problem 4 (Object Storage and Packages Omnibus Configuration is not loading in correctly)
Problem Statement: At first glance, it seems that the Omnibus Configuration isn't being specified correctly from .gitlab-ci.yml. What should be passed through is --omnibus-config object_storage, but it looks like --omnibus-config is not being passed, only object_storage, resulting in GitLab QA attempting to load that as a Scenario. It appears that specs.rb is stripping the --omnibus-config directive due to faulty logic in how it parses feature flags.
Solution: More robust logic was added to the Runner so that the handling of options passed into GitLab QA that take an argument (e.g., --omnibus-config) also applies to the argument itself (e.g., --omnibus-config default). Without gitlab-qa!686 (merged), the default argument passed to GitLab QA was also being passed through to the RSpec runner, resulting in the problem.
An error occurred while loading ./object_storage.
Failure/Error: __send__(method, file)
LoadError:
cannot load such file -- /home/gitlab/qa/object_storage
# ./qa/specs/runner.rb:73:in `perform'
# ./qa/scenario/template.rb:10:in `block in perform'
# ./qa/scenario/template.rb:8:in `tap'
# ./qa/scenario/template.rb:8:in `perform'
# ./qa/scenario/template.rb:35:in `perform'
# ./qa/scenario/template.rb:10:in `block in perform'
# ./qa/scenario/template.rb:8:in `tap'
# ./qa/scenario/template.rb:8:in `perform'
# ./qa/scenario/bootable.rb:28:in `launch!'
Run options: exclude {:orchestrated=>true, :transient=>true, :geo=>true, :requires_praefect=>true}
Finished in 0.00015 seconds (files took 32.44 seconds to load)
0 examples, 0 failures, 1 error occurred outside of examples
Seen in ee:object_storage, ee:packages
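The argument-handling problem in Problem 4 can be sketched as follows. This is an illustration under assumptions, not the code from gitlab-qa!686: OPTIONS_WITH_ARGUMENT, strip_flags_only and strip_flags_and_values are hypothetical names, and the real Runner logic may differ.

```ruby
# Options that consume the following argument (hypothetical list for this sketch).
OPTIONS_WITH_ARGUMENT = ['--omnibus-config'].freeze

# Faulty approach: drops the option but leaves its value behind, so the value
# leaks through to the RSpec runner as if it were a file to load.
def strip_flags_only(args)
  args.reject { |arg| OPTIONS_WITH_ARGUMENT.include?(arg) }
end

# More robust approach: drops the option together with the value that follows it.
def strip_flags_and_values(args)
  remaining = []
  skip_next = false
  args.each do |arg|
    if skip_next
      skip_next = false
    elsif OPTIONS_WITH_ARGUMENT.include?(arg)
      skip_next = true
    else
      remaining << arg
    end
  end
  remaining
end

args = ['--omnibus-config', 'object_storage', '--tag', 'object_storage']
p strip_flags_only(args)
p strip_flags_and_values(args)
```

With the faulty approach a bare object_storage remains at the front of the argument list, which is consistent with the LoadError for /home/gitlab/qa/object_storage shown for this problem.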
Problem 5 (object_storage and package tests timing out)
Problem Statement: This problem arose because we removed the Test::Integration::Packages and Test::Integration::ObjectStorage scenarios. These scenarios had specified the packages and object_storage tags respectively, limiting the test scope. After gitlab-qa@096545db was committed, the scope was no longer limited, meaning that all tests were running within the ee:packages and ee:object_storage jobs, resulting in a timeout.
Solution: We needed to specify which tags to run. These tags are specified in .gitlab-ci.yml and ultimately passed in as an RSpec parameter to the RSpec runner.
Duration: 120 minutes 28 seconds
Timeout: 2h (from project)
Job's log exceeded limit of 4194304 bytes.
Seen in #326526 (closed)
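The effect of tags on test scope in Problem 5 can be modelled with a small sketch. The example names and tag assignments in EXAMPLES are invented for illustration, and tag_args and selected_examples are hypothetical helpers. Only the --tag option itself is a real RSpec command-line flag.

```ruby
# Invented examples and tags, standing in for the real QA suite.
EXAMPLES = {
  'uploads to object storage' => [:object_storage],
  'publishes an npm package' => [:packages],
  'creates a merge request' => []
}.freeze

# Builds the RSpec command-line arguments for a list of tags.
def tag_args(tags)
  tags.flat_map { |tag| ['--tag', tag.to_s] }
end

# Models RSpec inclusion filtering: no tags means every example runs.
def selected_examples(tags)
  return EXAMPLES.keys if tags.empty?

  EXAMPLES.select { |_name, example_tags| (example_tags & tags).any? }.keys
end

p tag_args([:packages])
p selected_examples([])
p selected_examples([:packages])
```

With no tags the model selects every example, which mirrors how the two jobs ended up running the whole suite until tags were passed explicitly.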