CRC Checksums during Deploy Stage slow down merge trains (missing C extension)
NOTE: This fix is only a workaround until the root-cause issue in the latest google/cloud-sdk
Docker image is fixed. I (@cwoolley-gitlab) am following that issue and will revisit this when it is resolved.
What is/are the relevant URLs or emails?
https://gitlab.com/gitlab-com/www-gitlab-com/-/jobs/510841115
Briefly describe the bug
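The job takes more than 13 minutes to complete and spends most of that time computing CRC32C checksums. The repository contains several binary files (such as the GIFs in the log below) whose checksums are recomputed on every run.
Because the crcmod C extension is missing, gsutil falls back to computing these checksums on the fly in pure Python, which is very slow.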
$ gcloud config set project $GCP_PROJECT
Updated property [core/project].
WARNING: You do not appear to have access to project [gitlab-production] or it does not exist.
$ gsutil -h "Cache-Control:public, max-age=600" -m rsync -c -d -r public/ gs://$GCP_BUCKET
WARNING: You have requested checksumming but your crcmod installation isn't
using the module's C extension, so checksumming will run very slowly. For help
installing the extension, please see "gsutil help crcmod".
Building synchronization state...
At destination listing 10000...
At source listing 10000...
Starting synchronization...
Computing CRC32C for file://public/direction/operations/operationsflow.gif...
Computing CRC32C for file://public/direction/personas/executives/executiveflow.gif...
Computing CRC32C for file://public/direction/security/securityflow.gif...
Copying file://public/direction/maturity/index.html [Content-Type=text/html]...
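To confirm from inside the job image that the slow path is in use, a check along these lines could be run before the rsync step. This is a minimal sketch: the function name `crcmod_is_compiled` is made up for illustration, and it assumes that `gsutil version -l` prints a `compiled crcmod: True` or `compiled crcmod: False` line in the image being used.

```shell
# Sketch: report whether gsutil's crcmod uses the compiled C extension.
# Assumption: `gsutil version -l` prints a "compiled crcmod: True|False" line.
# crcmod_is_compiled is a hypothetical helper name, not part of gsutil.
crcmod_is_compiled() {
  gsutil version -l 2>/dev/null | grep -qi 'compiled crcmod: *true'
}

if crcmod_is_compiled; then
  echo "crcmod: C extension in use"
else
  echo "crcmod: C extension not detected (checksumming will be slow)"
fi
```

If the second message appears, the `-c` flag in the rsync command above is being served by the slow fallback that the warning describes.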
Deployment Time Impacts
This is also severely impacting deployment times on www-gitlab-com.
The spike in deployment times correlates with the first appearance of this warning on April 14th, as shown in #7233 (closed).
Here are the pipeline metrics for the spike:
Suggestions
- Install the C extension for gsutil
- Verify that the checksums are only rebuilt for new files (might need to keep an index cache)
- Monitor the job runtime (separate issue)
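For the first suggestion, the steps that `gsutil help crcmod` describes for Debian-based systems could be wrapped roughly as follows. This is a sketch under assumptions, not the fix that was actually applied: it assumes the google/cloud-sdk image is Debian-based, that `pip3` is available and installs into the same Python that gsutil runs under, and that the job runs as root. The `run` helper, the `DRY_RUN` variable, and the function name `install_crcmod_extension` are hypothetical and exist only to make the sketch easy to try without changing the system.

```shell
# Hypothetical helper: print commands instead of running them when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Rebuild crcmod so that its C extension gets compiled.
# Assumptions: Debian-based image, root user, pip3 targets gsutil's Python.
install_crcmod_extension() {
  run apt-get update -qq || return 1
  # A compiler and the Python headers are needed to build the extension.
  run apt-get install -y -qq gcc python3-dev python3-setuptools || return 1
  # Remove any existing pure-Python install; ignore failure if none exists.
  run pip3 uninstall -y crcmod || true
  # --no-cache-dir forces a fresh build instead of reusing a cached wheel.
  run pip3 install --no-cache-dir -U crcmod
}
```

In the deploy job, `install_crcmod_extension` would be called before the `gsutil ... rsync -c` line. Running `DRY_RUN=1 install_crcmod_extension` first prints the commands without executing them.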
/cc @gl-website