How to troubleshoot being able to upload logs to minio but not artifacts?
Summary
We host a Minio tenant with self-signed certificates for our Gitlab (all on the same Kubernetes cluster). Uploading the runner cache and job logs to it works fine, but uploading artifacts fails with a 500 Internal Server Error. The job.log is available in the Minio Console, but fetching the logs in Gitlab after a while results in another 500, this time with ActionView::Template::Error (SSL_connect returned=1 errno=0 state=error: certificate verify failed (self signed certificate)).
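Since the second 500 is a certificate verification failure, one way to narrow it down is to attempt the same TLS handshake outside of Gitlab, once with the default trust store and once with the intermediate CA. The following is only a minimal sketch using the Python standard library; the helper names are made up for illustration, and the CA path in the usage note is a placeholder rather than a path from our setup.

```python
import socket
import ssl


def build_context(ca_file=None):
    """Return a verifying TLS context; ca_file is an optional PEM bundle."""
    # create_default_context turns on hostname checking and CERT_REQUIRED.
    return ssl.create_default_context(cafile=ca_file)


def check_tls(host, port=443, ca_file=None, timeout=5.0):
    """Attempt a TLS handshake and return (ok, detail)."""
    try:
        context = build_context(ca_file)
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, tls.version()
    except ssl.SSLCertVerificationError as exc:
        # Same class of failure as the "certificate verify failed" in the 500.
        return False, f"certificate verify failed: {exc.verify_message}"
    except OSError as exc:
        # Covers DNS errors, refused connections and a missing CA file.
        return False, f"failed: {exc}"
```

Run from inside the webservice or sidekiq pod, `check_tls("api.s3.minio.infra.devopsautomation.domain.com")` and the same call with `ca_file="/path/to/intermediate-ca.pem"` (hypothetical path) should show whether the CA that pod actually trusts is enough to verify the Minio endpoint.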
Steps to reproduce
Gitlab is installed on RKE2 with Kubernetes 1.27 on-prem together with MetalLB, Nginx, linkerd and Cilium. Before this, we ran the same setup on EKS, i.e. same Gitlab version, Traefik instead of Nginx, linkerd+cilium but with Let's Encrypt Certificates. Instead of Longhorn, Minio and MetalLB we used the AWS equivalents. Everything worked fine there. Then we imported a backup from S3 into toolbox and used backup-utility --restore -f Backup_file. The kubernetes secrets stayed exactly the same, only the S3 Access Key has been created anew.
We have deployed everything through Helm. SSL certificates are now issued by an intermediate CA via smallstep/step-issuer, and the certificate is saved onto every node as well as into Gitlab and Gitlab Runner, as seen in the values.yaml below.
gitlab-values.yaml (Chart v6.9.8, App v15.9.8):
global:
edition: ce
certificates:
customCAs:
- secret: devopsautomation-intermediate-ca
- secret: gitlab.infra.devopsautomation.domain.com-tls
hosts:
domain: infra.devopsautomation.domain.com
https: true # true: NGINX https: nginx chart needs to be installed
gitlab:
https: true # true: Gitlab https: gitlab.domain.com will use https:// instead of http://
task-runner:
registry:
enabled: false
ingress:
class: nginx
annotations:
cert-manager.io/cluster-issuer: step-issuer
nginx.ingress.kubernetes.io/x-forwarded-proto: "https"
nginx.ingress.kubernetes.io/x-forwarded-ssl: "on"
tls:
enabled: true
secretName: gitlab.infra.devopsautomation.domain.com-tls
configureCertmanager: false
gitlab:
resources:
requests:
cpu: 50m
initialRootPassword:
secret: gitlab-gitlab-initial-root-password
key: password
gitaly:
enabled: true
authToken:
secret: gitlab-gitaly-secret
key: token
persistence:
accessMode: ReadWriteOnce
size: 15Gi
tls:
enabled: false # tls between Gitlab components
minio:
enabled: false
appConfig:
defaultProjectsFeatures:
containerRegistry: false
object_store:
enabled: false
directUpload: true
connection:
secret: gitlab-object-store
key: connection
lfs:
enabled: false
artifacts:
bucket: devopsautomation-gitlab-object-store
proxy_download: false
connection:
secret: gitlab-object-store
key: connection
uploads:
enabled: true
proxy_download: false
bucket: devopsautomation-gitlab-object-store
connection:
secret: gitlab-object-store
key: connection
packages:
enabled: false
backups:
bucket: devopsautomation-gitlab-backup-storage
tmpBucket: devopsautomation-gitlab-tmp-storage
ldap:
preventSignin: true
omniauth:
enabled: true
# autoSignInWithProvider: openid_connect
syncProfileFromProvider: [openid_connect]
syncProfileAttributes: ['openid', 'email', 'profile']
allowSingleSignOn: [openid_connect]
admin_groups: ['devopsautomation-admins']
blockAutoCreatedUsers: false # https://docs.gitlab.com/ee/user/admin_area/settings/sign_up_restrictions.html
autoLinkUser: [openid_connect]
providers:
- secret: openid-connect
key: provider
initialDefaults:
signupEnabled: false
railsSecrets:
secret: gitlab-rails-secret
registry:
enabled: false
pages:
objectStore:
enabled: false
runner:
registrationToken:
secret: gitlab-gitlab-runner-secret
webservice:
nodeSelector:
name: infra
registry:
enabled: false
ingress:
tls:
enabled: true
secretName: gitlab.infra.devopsautomation.domain.com-tls
workerTimeout: 60
certmanager:
installCRDs: false
install: false
nginx-ingress:
enabled: false
prometheus:
install: false
redis:
auth:
existingSecret: gitlab-redis-secret
existingSecretKey: secret
usePasswordFiles: true
master:
persistence:
accessMode: ReadWriteOnce
size: 25Gi
postgresql:
persistence:
accessMode: ReadWriteOnce
size: 20Gi
registry:
enabled: false
gitlab-runner:
# runnerToken: "gitlab-gitlab-runner-secret" # https://gitlab.com/gitlab-org/gitlab/-/issues/10352
# secret: gitlab-gitlab-runner-secret
unregisterRunners: true
# gitlabUrl: https://gitlab-webservice-default.gitlab.svc.cluster.local:8181
gitlabUrl: https://gitlab.infra.devopsautomation.domain.com
certsSecretName: devopsautomation-intermediate-ca
install: true
concurrent: 2
rbac:
create: true
runners:
locked: false
config: |
[[runners]]
[runners.kubernetes]
image = "ubuntu:22.04"
cpu_limit = "0.6"
memory_limit = "3Gi"
service_cpu_limit = "0.6"
poll_timeout = 240
output_limit = 10240
[runners.kubernetes.volumes]
[[runners.kubernetes.volumes.host_path]]
name = "devopsautomation-intermediate-ca"
mount_path = "/etc/gitlab-runner/certs/gitlab.infra.devopsautomation.domain.com.crt"
read_only = true
host_path = "/certs/gitlab.infra.devopsautomation.domain.com.crt"
[runners.cache]
Type = "s3"
Path = "runner"
Shared = true
[runners.cache.s3]
ServerAddress = "api.s3.minio.infra.devopsautomation.domain.com"
BucketName = "devopsautomation-gitlab-runner-cache"
BucketLocation = "eu-central-1"
Insecure = true
AuthenticationType = "access-key"
cache:
secretName: s3-access-key
gitlab:
toolbox:
extraEnvFrom:
AWS_ACCESS_KEY_ID:
secretKeyRef:
name: gitlab-minio-access-key-secret
key: accesskey
AWS_SECRET_ACCESS_KEY:
secretKeyRef:
name: gitlab-minio-access-key-secret
key: secretkey
extraEnv:
AWS_DEFAULT_REGION: eu-west-1
AWS_CA_BUNDLE: /etc/ssl/certs/gitlab.infra.devopsautomation.domain.com.crt
backups:
objectStorage:
config:
secret: backup-storage-config
key: config
cron:
enabled: true
concurrencyPolicy: Replace
failedJobsHistoryLimit: 1
schedule: "0 20 * * *"
successfulJobsHistoryLimit: 3
suspend: false
backoffLimit: 6
restartPolicy: "OnFailure"
enabled: true
migrations:
enabled: true
webservice:
enabled: true
registry:
enabled: false
sidekiq:
registry:
enabled: false
enabled: true
replicas: 1
storage.config (in s3cmd syntax to create backup config):
[default]
access_key = MINIO_ACCESS_KEY
secret_key = MINIO_SECRET_KEY
bucket_location = eu-central-1
multipart_chunk_size_mb = 128
host = api.s3.minio.infra.devopsautomation.domain.com
endpoint = http://api.s3.minio.infra.devopsautomation.domain.com
aws_signature_version = 4
enable_multipart = True
use_https = False
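A possible sanity check for this file: as far as I know, s3cmd takes its endpoint from the host_base and host_bucket options, and host / endpoint are not option names I recognise from s3cmd, so the tool may be falling back to its AWS defaults. That is an assumption to verify against the s3cmd documentation, not a confirmed diagnosis. The sketch below (function and constant names are illustrative) only reports which of those two keys are absent from the config shown above.

```python
import configparser

# Option names s3cmd is assumed to read its endpoint from (to be verified).
ENDPOINT_KEYS = ("host_base", "host_bucket")

# The storage.config from this issue, copied verbatim.
STORAGE_CONFIG = """\
[default]
access_key = MINIO_ACCESS_KEY
secret_key = MINIO_SECRET_KEY
bucket_location = eu-central-1
multipart_chunk_size_mb = 128
host = api.s3.minio.infra.devopsautomation.domain.com
endpoint = http://api.s3.minio.infra.devopsautomation.domain.com
aws_signature_version = 4
enable_multipart = True
use_https = False
"""


def missing_endpoint_keys(config_text, section="default"):
    """Return the assumed endpoint keys that the config does not define."""
    # interpolation=None so values such as %(bucket)s are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return [key for key in ENDPOINT_KEYS if not parser.has_option(section, key)]
```

For the config above, `missing_endpoint_keys(STORAGE_CONFIG)` lists both keys as missing, which would be consistent with s3cmd never contacting the Minio host.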
object-store_s3_creds.yaml to create gitlab-object-store connection:
provider: AWS
aws_access_key_id: MINIO_ACCESS_KEY
aws_secret_access_key: MINIO_SECRET_KEY
region: eu-central-1
host: api.s3.minio.infra.devopsautomation.domain.com
path_style: true
# Use this to:
# kubectl create secret generic gitlab-object-store -n gitlab --from-file=connection=./object-store_s3_creds.yaml
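One difference between the three configs is the scheme: the runner cache uses Insecure = true and the s3cmd config uses use_https = False, while this connection sets only host and no endpoint. The sketch below makes that comparison explicit. It assumes that the S3 client falls back to HTTPS when no endpoint is given, and that an endpoint key (as in the Minio examples of the Gitlab chart documentation) overrides it; both points are assumptions to verify, and the helper names are illustrative.

```python
from urllib.parse import urlparse

# The object-store_s3_creds.yaml from this issue, copied verbatim.
CONNECTION = """\
provider: AWS
aws_access_key_id: MINIO_ACCESS_KEY
aws_secret_access_key: MINIO_SECRET_KEY
region: eu-central-1
host: api.s3.minio.infra.devopsautomation.domain.com
path_style: true
"""


def parse_flat_yaml(text):
    """Parse simple 'key: value' lines; comments and blank lines are skipped."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        result[key.strip()] = value.strip().strip('"')
    return result


def effective_endpoint(connection):
    """Return (scheme, host), assuming HTTPS when no endpoint is set."""
    endpoint = connection.get("endpoint")
    if endpoint:
        parsed = urlparse(endpoint)
        return parsed.scheme, parsed.netloc
    return "https", connection.get("host", "")
```

Under these assumptions, Gitlab itself would be the only component talking to Minio over HTTPS, which would fit the picture of cache uploads succeeding while requests made through this connection hit the self-signed certificate.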
The buckets in Minio are created beforehand and the access key has enough permissions to upload cache as well as job logs.
Backups also fail with:
Bucket not found: registry. Skipping backup of registry ...
Bucket not found: devopsautomation-gitlab-object-store. Skipping backup of uploads ...
Bucket not found: devopsautomation-gitlab-object-store. Skipping backup of artifacts ...
Bucket not found: git-lfs. Skipping backup of lfs ...
Bucket not found: gitlab-packages. Skipping backup of packages ...
Bucket not found: gitlab-mr-diffs. Skipping backup of external_diffs ...
Bucket not found: gitlab-terraform-state. Skipping backup of terraform_state ...
Bucket not found: gitlab-pages. Skipping backup of pages ...
Bucket not found: gitlab-ci-secure-files. Skipping backup of ci_secure_files ...
Packing up backup tar
WARNING: Module python-magic is not available. Guessing MIME types based on file extensions.
ERROR: S3 error: 403 (InvalidAccessKeyId): The AWS Access Key Id you provided does not exist in our records.
command terminated with exit code 77
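To make this output easier to compare against the buckets that actually exist in Minio, the log can be reduced to the skipped components and the final S3 error. This is a small sketch with illustrative names; the patterns are written against the exact log lines above and nothing else.

```python
import re

# Patterns match the backup-utility lines quoted in this issue.
BUCKET_RE = re.compile(r"^Bucket not found: (\S+)\. Skipping backup of (\S+)")
S3_ERROR_RE = re.compile(r"^ERROR: S3 error: (\d+) \((\w+)\)")


def summarize_backup_log(text):
    """Return ({component: bucket} for skipped backups, [(status, code)] for S3 errors)."""
    missing = {}
    errors = []
    for line in text.splitlines():
        match = BUCKET_RE.match(line)
        if match:
            missing[match.group(2)] = match.group(1)
            continue
        match = S3_ERROR_RE.match(line)
        if match:
            errors.append((int(match.group(1)), match.group(2)))
    return missing, errors
```

Notably, devopsautomation-gitlab-object-store is reported as not found for uploads and artifacts even though that bucket exists in Minio, so the tool may be looking at a different endpoint than the one we configured.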
If we use awscli as the s3tool, it obviously only calls AWS S3, which is not the correct domain.
What is the current bug behavior?
Internal server error in the UI. No 500 errors are visible in the webservice pod logs anymore; before, with global.appConfig.artifacts.proxy_download=true, they were logged as errors.
What is the expected correct behavior?
Artifacts are uploaded and logs are downloaded correctly.
Do you have some idea where we might be able to find relevant logs? KAS, Sidekiq and Webservice don't log anything useful, runner is the same. I checked var/log/gitlab/production.log and others there inside the KAS (or Sidekiq/Toolbox) pod, but couldn't find any error.
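One place not yet covered in the list above is the Workhorse container of the webservice pod, since artifact uploads pass through it. Assuming its output is JSON with one object per line and a numeric status field (an assumption about the log format, not something confirmed here), a filter like this sketch could pull out only the 5xx entries. The function name and the status_field parameter are illustrative.

```python
import json


def server_errors(lines, status_field="status"):
    """Return the parsed JSON log entries whose status is 500 or higher."""
    hits = []
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            # Skip plain-text lines mixed into the log stream.
            continue
        if not isinstance(entry, dict):
            continue
        status = entry.get(status_field)
        if isinstance(status, int) and status >= 500:
            hits.append(entry)
    return hits
```

It can be fed from a saved log file or from `sys.stdin` when piping the container logs into a small script that calls `server_errors(sys.stdin)`.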