Try remote caching
Closes #11.
The latest test job for master took around 22 minutes. More than 8 of those minutes were spent downloading and unpacking the GitLab CI cache, and about the same again packing and uploading it, so roughly 16 minutes went to cache handling. That leaves around 6 minutes for the build itself when it uses the downloaded cache.
A build without the CI cache takes 12 minutes, so... it's better without the cache?! Well, if the cache were already on disk and unpacked instantaneously, that would be a 2x improvement (6 vs 12 minutes). But downloading and unpacking takes a long time. This is likely because our build has a lot of dependencies and therefore generates a lot of intermediate files, which makes the cache big both in MB and in number of files (37,000+).
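The breakdown above can be checked with a few lines of shell arithmetic. This is only a sketch that uses the rounded minute figures quoted in this description; the variable names are made up for illustration.

```shell
#!/bin/sh
# Rounded figures quoted in this description, in minutes.
total_job=22        # latest test job on master
cache_download=8    # downloading and unpacking the GitLab CI cache
cache_upload=8      # packing and uploading it again
build_no_cache=12   # build without the CI cache

# Time spent only on moving the cache around.
cache_overhead=$((cache_download + cache_upload))
# What is left for the build itself once the cache is on disk.
build_with_cache=$((total_job - cache_overhead))
# How much slower the job is with the CI cache than without it.
extra_cost=$((total_job - build_no_cache))

echo "cache overhead:        ${cache_overhead} min"    # 16
echo "build with warm cache: ${build_with_cache} min"  # 6
echo "build without cache:   ${build_no_cache} min"    # 12
echo "CI cache costs ${extra_cost} extra min per job"  # 10
```

So the CI cache only pays off if moving it takes less than the 6 minutes it saves, and here it takes about 16.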
This MR enables Bazel's remote caching.
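For readers who have not used it, here is a rough sketch of what pointing Bazel at a remote cache can look like on the command line. It assumes a Google Cloud Storage bucket used as an HTTP cache with application default credentials; the bucket name is hypothetical, and the exact flags used in this MR may differ. The script only prints the command, it does not run Bazel.

```shell
#!/bin/sh
# Hypothetical bucket name; the real bucket is not named in this description.
CACHE_BUCKET="${CACHE_BUCKET:-example-bazel-cache}"

# Print the flags that point Bazel at a GCS bucket used as a remote cache.
# --remote_cache and --google_default_credentials are standard Bazel flags;
# whether this MR uses exactly these is an assumption.
remote_cache_flags() {
  printf '%s %s\n' \
    "--remote_cache=https://storage.googleapis.com/${CACHE_BUCKET}" \
    "--google_default_credentials"
}

# Dry run: show the command instead of executing it.
echo "bazel test //... $(remote_cache_flags)"
```

In a real setup these flags would more likely live in a .bazelrc file than on the command line.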
First run
Duration: 33 minutes 48 seconds
https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/jobs/701790804
This is slow because the build populates the remote cache by uploading all the build artifacts, including the intermediate ones.
Second run - no code changes
Duration: 5 minutes 38 seconds
https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/jobs/701816523
This is much faster! Faster than the original 22-minute build.
Third run - no code changes, --remote_download_minimal flag
Duration: 4 minutes 54 seconds
https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/jobs/701822277
This is even better. But if everything is cached, what's happening for 5 minutes?! It's downloading all the libraries we depend on. More on this in the FAQ. It's unfortunate, but we cannot do much about it at the moment.
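To put the three CI runs on one scale, the sketch below expresses each duration as a percentage of the original 22-minute job. The numbers are the ones reported above; the helper name is made up, and integer division truncates the result.

```shell
#!/bin/sh
# Durations reported above, converted to seconds.
baseline=$((22 * 60))          # original job with the GitLab CI cache
first_run=$((33 * 60 + 48))    # first run, populates the remote cache
second_run=$((5 * 60 + 38))    # second run, no code changes
third_run=$((4 * 60 + 54))     # third run, with --remote_download_minimal

# Print a duration as a whole-number percentage of the baseline.
# Shell arithmetic is integer-only, so the value is truncated.
percent_of_baseline() {
  echo $(( $1 * 100 / baseline ))
}

echo "first run:  ${first_run}s, $(percent_of_baseline "$first_run")% of baseline"    # 2028s, 153%
echo "second run: ${second_run}s, $(percent_of_baseline "$second_run")% of baseline"  # 338s, 25%
echo "third run:  ${third_run}s, $(percent_of_baseline "$third_run")% of baseline"    # 294s, 22%
```

The first run costs about half as much again as the old job, but every later run with a warm remote cache takes roughly a quarter of the old time.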
p.s. To learn more about remote caching and remote execution in Bazel, watch this talk (https://www.youtube.com/watch?v=MyuJRUwT5LI). There are many other talks and blog posts on this topic if you are curious.
p.p.s. Let's try a local build:
bazel test //...
INFO: Invocation ID: 4ac3b676-4686-40da-879d-0a529db036b4
INFO: Build option --test_env has changed, discarding analysis cache.
INFO: Analyzed 33 targets (654 packages loaded, 11719 targets configured).
INFO: Found 27 targets and 6 test targets...
INFO: Elapsed time: 3225.158s, Critical Path: 3163.48s
INFO: 623 processes: 623 darwin-sandbox.
INFO: Build completed successfully, 624 total actions
//internal/agentk:go_default_test PASSED in 6.9s
//internal/gitlab:go_default_test PASSED in 0.5s
//internal/kas:go_default_test PASSED in 0.5s
//internal/tools/testing/kube_testing:go_default_test PASSED in 0.8s
//internal/tools/wstunnel:go_default_test PASSED in 5.0s
//pkg/agentcfg:go_default_test PASSED in 0.4s
INFO: Build completed successfully, 624 total actions
3225.158s is almost 54 minutes. My internet connection from Sydney to the GCP bucket in the US is struggling quite a bit. Also, it's just 100/40 Mbps down/up.
Let's try again, now with fully populated remote and local caches:
bazel test //...
INFO: Invocation ID: 1d7ae98f-0c87-44da-bd06-033fd5ab1328
INFO: Build option --test_env has changed, discarding analysis cache.
INFO: Analyzed 33 targets (0 packages loaded, 11719 targets configured).
INFO: Found 27 targets and 6 test targets...
INFO: Elapsed time: 0.909s, Critical Path: 0.24s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
//internal/agentk:go_default_test (cached) PASSED in 6.9s
//internal/gitlab:go_default_test (cached) PASSED in 0.5s
//internal/kas:go_default_test (cached) PASSED in 0.5s
//internal/tools/testing/kube_testing:go_default_test (cached) PASSED in 0.8s
//internal/tools/wstunnel:go_default_test (cached) PASSED in 5.0s
//pkg/agentcfg:go_default_test (cached) PASSED in 0.4s
INFO: Build completed successfully, 1 total action
1 second.
Let's try once more, with a clean local cache:
bazel test //...
INFO: Invocation ID: 84940def-0584-495d-a9bd-94a0831fc5cf
INFO: Build option --test_env has changed, discarding analysis cache.
INFO: Analyzed 33 targets (654 packages loaded, 11719 targets configured).
INFO: Found 27 targets and 6 test targets...
INFO: Elapsed time: 118.916s, Critical Path: 69.62s
INFO: 1298 processes: 1290 remote cache hit, 8 darwin-sandbox.
INFO: Build completed successfully, 1306 total actions
//internal/agentk:go_default_test (cached) PASSED in 1.2s
//internal/gitlab:go_default_test (cached) PASSED in 0.9s
//internal/kas:go_default_test (cached) PASSED in 1.0s
//internal/tools/testing/kube_testing:go_default_test (cached) PASSED in 0.8s
//internal/tools/wstunnel:go_default_test (cached) PASSED in 1.0s
//pkg/agentcfg:go_default_test (cached) PASSED in 1.3s
INFO: Build completed successfully, 1306 total actions
2 minutes, which is very good considering my internet connection.
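The elapsed times and the cache-hit count in the logs above are easier to compare once converted. The sketch below does that with plain shell arithmetic; the helper name is made up, the elapsed times are truncated to whole seconds, and the hit rate is truncated by integer division.

```shell
#!/bin/sh
# Format whole seconds as "Mm SSs".
fmt_duration() {
  printf '%dm %02ds\n' $(( $1 / 60 )) $(( $1 % 60 ))
}

# Elapsed times from the logs above, truncated to whole seconds.
echo "first local run:   $(fmt_duration 3225)"   # 53m 45s
echo "clean local cache: $(fmt_duration 118)"    # 1m 58s

# From the last log: "1298 processes: 1290 remote cache hit, 8 darwin-sandbox".
total_processes=1298
remote_hits=1290
hit_percent=$(( remote_hits * 100 / total_processes ))
echo "remote cache hit rate: ${hit_percent}%"    # 99%
```

Only 8 of the 1298 processes ran locally in the last run; the rest were served from the remote cache.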