Introduce GCS adapter for remote cache
What does this MR do?
Adds GCS support for caching mechanism
Why was this MR needed?
GCS doesn't work well with minio-go library, because of differences in S3 protocol implementations. This MR adds support for GCS using the GCS own API to create presigned URLs for uploading and downloading.
Are there points in the code the reviewer needs to double check?
Does this MR meet the acceptance criteria?
-
Documentation created/updated: http://review-docs-intro-xleo88.178.62.207.141.xip.io/runner/configuration/advanced-configuration.html#the-runners-cache-section -
Added tests for this feature/bug -
In case of conflicts with master
- branch was rebased
What are the relevant issue numbers?
Closes #1773 (closed)
Merge request reports
Activity
Hey @tmaczukin, is it possible to design it in a way that
[runners.cache.gcs]
section would contain just path to the json credentials file, instead of specifying the private key in the open? That would heavily simplify the deployments, as all what's needed would be just pass a properGOOGLE_APPLICATION_CREDENTIALS
env var to runner.Hey @ilyaf, thanks for the comment. I see a point of using the credentials file directly. But I still like to have possibility to specify credentials directly in the
config.toml
file - it's how we're supporting S3 and in some cases it's easier to have only one file to manage.I'll se if I'll be able to add support for both cases in a clean way :)
I'll be awesome if you can support both! for context, we have secret management uncoupled from the runners configuration, and being able to rotate secrets w/o having to parse them and add private key to runner config would be very nice for us (and probably not only for us).
I'm
this thread and volunteer to test the gcs implementation in our setup, thanks for working on this!@nolith Could you review this? Regarding recent problems with cache servers for our Shared Runners (https://gitlab.com/gitlab-com/infrastructure/issues/4565#note_91514068) I'd really like to test this on production soon. GCS should be much more reliable than our own servers with
minio
on board, and it would additionaly help us safe some costs.As for changes in
vendor/
(which may look scary) these are minimal updates requied to get google API library working.Edited by Tomasz Maczukinassigned to @nolith
@tmaczukin 5 degradations in codequality.
assigned to @tmaczukin
- Resolved by Tomasz Maczukin
- Resolved by Tomasz Maczukin
@ilyaf Are you still interested in testing this? If yes, then please wait for the pipeline to become green. After this you can download the binary/package from https://gitlab-runner-downloads.s3.amazonaws.com/introduce-gcs-cache-support/index.html. And the documentation preview is at http://review-docs-intro-xleo88.178.62.207.141.xip.io/runner/configuration/advanced-configuration.html#the-runners-cache-gcs-section.
If you will need a docker image (don't know how your setup looks like), then let me know - I'll restart the pipeline with option to store the image at https://gitlab.com/gitlab-org/gitlab-runner/container_registry
added 9 commits
-
ad20934a...f9a9b5f4 - 2 commits from branch
master
- 689da775 - Introduce modular structure for shells/cache
- ca011e7f - Add GCS cache adapter
- 75b42ca3 - Update dependencies
- b39893a0 - Support GCS credentials from file
- 0608ac13 - Update documentation
- 08af3482 - Refactor cache.Adapter interface
- dbf3795a - Print explicitly that there is no URL and cache will be not uploaded/downloaded
Toggle commit list-
ad20934a...f9a9b5f4 - 2 commits from branch
added 20 commits
-
dbf3795a...80396350 - 11 commits from branch
master
- 88bb6c7b - Introduce modular structure for shells/cache
- 41bf1e92 - Add GCS cache adapter
- ca8cc042 - Update dependencies
- f55656a6 - Support GCS credentials from file
- 77c02181 - Update documentation
- 41252bca - Refactor cache.Adapter interface
- c0915ff5 - Print explicitly that there is no URL and cache will be not uploaded/downloaded
- 087d67db - Add BucketName explicitly to gcs config struct
- c0ba6ccc - Create dedicated section for S3 cache configuration
Toggle commit list-
dbf3795a...80396350 - 11 commits from branch
- Resolved by Alessio Caiazza
OK, added changes that I wanted.
I've introduced a
[runners.cache.s3]
configuration section. This will make things consistent with freshly introduced GCS configuratn, and it will make the configuration cleaner in the future (which will be especially useful, if we'll introduce another cache adapter).I've made current configuration format (with S3 options directly in
[runners.cache]
) deprecated. After we'll merge this MR, we should create an issue, pointed to %12.0, that will contain the cleanup part.I'm removing the WIP status - the MR may be reviewed again
changed milestone to %11.2
mentioned in issue #1773 (closed)
added 1 commit
- eeb5e66f - Create dedicated section for S3 cache configuration
added 1 commit
- 6e21a483 - Create dedicated section for S3 cache configuration
- Resolved by Alessio Caiazza
- Resolved by Tomasz Maczukin
@axil Could you take a look on documentation changes added in this MR? :)
- Resolved by Tomasz Maczukin
- Resolved by Tomasz Maczukin
- Resolved by Tomasz Maczukin
- Resolved by Tomasz Maczukin
- Resolved by Tomasz Maczukin
- Resolved by Tomasz Maczukin
@tmaczukin thanks! I pushed some edits for the docs :)
added devopsverify missed:11.2 labels
changed milestone to %11.3
added 6 commits
- 9b1be095 - Refactor cache.Adapter interface
- f26723b0 - Print explicitly that there is no URL and cache will be not uploaded/downloaded
- cd715a7c - Add BucketName explicitly to gcs config struct
- e36e08af - Create dedicated section for S3 cache configuration
- 8f4e80eb - Copyedit docs for GCS cache support
- cd422efd - Refactorize structure and interface of ./shells/cache/...
Toggle commit listadded 20 commits
-
cd422efd...edd7ee23 - 9 commits from branch
master
- 3b5368c0 - Introduce modular structure for shells/cache
- ec528347 - Add GCS cache adapter
- 330905f9 - Update dependencies
- d3480624 - Support GCS credentials from file
- 7fa65eff - Update documentation
- 11773a8b - Refactor cache.Adapter interface
- e9f1efac - Print explicitly that there is no URL and cache will be not uploaded/downloaded
- b6b6fa4f - Add BucketName explicitly to gcs config struct
- e027c5ec - Create dedicated section for S3 cache configuration
- b4a5f937 - Copyedit docs for GCS cache support
- 6b8f471e - Refactorize structure and interface of ./shells/cache/...
Toggle commit list-
cd422efd...edd7ee23 - 9 commits from branch
added 20 commits
-
6b8f471e...fb161916 - 9 commits from branch
master
- 147a7b3b - Introduce modular structure for shells/cache
- 3943b641 - Add GCS cache adapter
- cbd13b1b - Update dependencies
- fc5cd3c9 - Support GCS credentials from file
- 24e4375b - Update documentation
- 190f35c5 - Refactor cache.Adapter interface
- 9c4c03ea - Print explicitly that there is no URL and cache will be not uploaded/downloaded
- 4451eb1c - Add BucketName explicitly to gcs config struct
- 75656b81 - Create dedicated section for S3 cache configuration
- fa209417 - Copyedit docs for GCS cache support
- bbd835c5 - Refactorize structure and interface of ./shells/cache/...
Toggle commit list-
6b8f471e...fb161916 - 9 commits from branch
added 1 commit
- 61d62b4b - Refactorize structure and interface of ./shells/cache/...
@tmaczukin whats the status of this MR? Are we still on track for %11.3
added 40 commits
-
61d62b4b...d06a9aa3 - 28 commits from branch
master
- 2f2abb41 - Introduce modular structure for shells/cache
- 357bb53c - Add GCS cache adapter
- d346e749 - Update dependencies
- 94e35637 - Support GCS credentials from file
- df4ebcab - Update documentation
- 23180186 - Refactor cache.Adapter interface
- fc8017ea - Print explicitly that there is no URL and cache will be not uploaded/downloaded
- 0a0ba2d5 - Add BucketName explicitly to gcs config struct
- 90bf4022 - Create dedicated section for S3 cache configuration
- 58f31801 - Copyedit docs for GCS cache support
- d690ff41 - Refactorize structure and interface of ./shells/cache/...
- 32df4c77 - Move cache package out of shell package
Toggle commit list-
61d62b4b...d06a9aa3 - 28 commits from branch
added 11 commits
- c7a142d7 - Add GCS cache adapter
- e7e9a98f - Update dependencies
- d2c8c108 - Support GCS credentials from file
- cf641c71 - Update documentation
- 864e4fdd - Refactor cache.Adapter interface
- 643ae12d - Print explicitly that there is no URL and cache will be not uploaded/downloaded
- ac0b0bc9 - Add BucketName explicitly to gcs config struct
- 3cce6df0 - Create dedicated section for S3 cache configuration
- 3ad5c2bd - Copyedit docs for GCS cache support
- 4b48d8d7 - Refactorize structure and interface of ./shells/cache/...
- cf6a9b9c - Move cache package out of shell package
Toggle commit listadded 10 commits
- 649d98e6 - Support GCS credentials from file
- 26475160 - Update documentation
- 3f46641d - Refactor cache.Adapter interface
- 263d4592 - Print explicitly that there is no URL and cache will be not uploaded/downloaded
- 37ebf586 - Add BucketName explicitly to gcs config struct
- 0b2dff30 - Create dedicated section for S3 cache configuration
- a2e74780 - Copyedit docs for GCS cache support
- 9f550be2 - Refactorize structure and interface of ./shells/cache/...
- 6b30c15f - Move cache package out of shell package
- 682cd202 - Update dependencies
Toggle commit list@nolith It's ready for a next review!
Few words of comments - while I can agree that some things I've proposed initially could be simplified I definitely can't agree with a statement that testability is not important.
First, a possibility to write unit tests that are indeed testing the feature, not the feature's dependencies, from a long time is understood as a prove that the code has a good shape. But what's more important I remember many problems in the past caused because we:
- didn't have tests (because it was too hard or even impossible to write them),
- were testing not the things that we should (relaying on dependencies and their output instead of checking the behavior of our code).
One of the reasons of why we still are finding many different problem that are hard to track is that we have currently only a ~50% of code coverage (https://gitlab.com/gitlab-org/gitlab-runner/-/jobs/91089797) - which is drastically small - and that we have a lot of the code messed together without any interface separation. Adding a new cache adapter was a good reason to refactorize the whole cache mechanism and make it easier to maintain in the future, and while during this I wanted to not introduce problems mentioned above.
Having all of the above in mind - including your comments - I've updated the MR and:
- moved some things around to put them in better places,
- hidden things that should be internal and are not a part of the interfaces of cache mechanism nor cache adapters,
- hidden mechanisms used for making the code more testable (so we will not have any global, public, mutable state that could be changed somewhere else than in tests happening internally in the package).
What's also important in the new state of this MR is that I've moved the whole
cache
package and its subpackages out from theshell
package. The fact that we're currently using this only fromshell
with the presigned URL feature doesn't mean that cache mechanism is part of ashell
.assigned to @nolith
added 8 commits
- 00a21a8f - Refactorize structure and interface of ./shells/cache/...
- 2cee2f6f - Move cache package out of shell package
- f075e88f - Update dependencies
- 42bf8742 - Update docs with proper version for GCS support
- 4ed29e7d - Refactorize cache code
- 006b194f - Refactorize S3 cache adapter
- 2e9364fd - Improve cache/adapter tests
- 12265ca6 - Improve GCS adapter tests
Toggle commit listHow good we have https://gitlab.com/gitlab-org/gitlab-runner/-/jobs/92303707 now
I've fixed code style offenses.
- Resolved by Alessio Caiazza
- Resolved by Alessio Caiazza
- Resolved by Alessio Caiazza
- Resolved by Alessio Caiazza
- Resolved by Alessio Caiazza
- Resolved by Alessio Caiazza
- Resolved by Alessio Caiazza
- Resolved by Alessio Caiazza
- Resolved by Alessio Caiazza
- Resolved by Alessio Caiazza
- Resolved by Alessio Caiazza
Can we consider reducing the package name depth?
- I think that
cache/adapter
can be merged intocache
- adapter initialization should be done in
main.go
like we are already doing for executors (this will also remove the need to add a comment to make the linter happy) - instead of using
cache/adapters/provider
use justcache/provider
- I think that
- Resolved by Alessio Caiazza
- Resolved by Alessio Caiazza
assigned to @tmaczukin
@tmaczukin I've left some comments, please do not mark them as resolved, having them around will help me reviewing your changes.
mentioned in merge request !336 (closed)
@nolith All comments addressed. Pipeline still passes green
assigned to @nolith
- Resolved by Alessio Caiazza
- Resolved by Alessio Caiazza
- Resolved by Alessio Caiazza
- Resolved by Alessio Caiazza
- Resolved by Alessio Caiazza
- Resolved by Alessio Caiazza
mentioned in issue #3473 (closed)
Thanks @tmaczukin, great job! I've left a couple of comments.
assigned to @tmaczukin
added 19 commits
- 6f8b522c - Support GCS credentials from file
- 3c1e5666 - Update documentation
- fa6db101 - Refactor cache.Adapter interface
- 289affb9 - Print explicitly that there is no URL and cache will be not uploaded/downloaded
- d568792a - Add BucketName explicitly to gcs config struct
- 3beda3fe - Create dedicated section for S3 cache configuration
- a1efd003 - Copyedit docs for GCS cache support
- a0872574 - Refactorize structure and interface of ./shells/cache/...
- bc30159c - Move cache package out of shell package
- 6a048bf1 - Update dependencies
- c33d089f - Update docs with proper version for GCS support
- 805262e0 - Refactorize cache code
- f032534c - Refactorize S3 cache adapter
- e6bca4f8 - Improve cache/adapter tests
- b5fd0d0a - Improve GCS adapter tests
- 6a4811ec - Improve naming, slightly refactor tests
- 32aa86c6 - Change logrus mocking in tests
- cde5fa67 - Reduce depth of cache package
- e49e28e6 - Raplace syscall.Unlink with os.Remove
Toggle commit listadded 14 commits
- 16aab0cf - Create dedicated section for S3 cache configuration
- 8d8600a2 - Copyedit docs for GCS cache support
- 7c1c10dc - Refactorize structure and interface of ./shells/cache/...
- 3f4e2460 - Move cache package out of shell package
- f604694f - Update dependencies
- 0eae36b2 - Update docs with proper version for GCS support
- d9bef9aa - Refactorize cache code
- 7d49aebf - Refactorize S3 cache adapter
- b205a3ca - Improve cache/adapter tests
- 2fa431a6 - Improve GCS adapter tests
- ef68565d - Improve naming, slightly refactor tests
- b1268e4a - Change logrus mocking in tests
- 2f440bc2 - Reduce depth of cache package
- a6b95027 - Raplace syscall.Unlink with os.Remove
Toggle commit list@nolith Updated as requested :)
assigned to @nolith
Thanks @tmaczukin
it LGTMenabled an automatic merge when the pipeline for a6b95027 succeeds
mentioned in commit 0177466a
mentioned in merge request !1012 (merged)
mentioned in merge request !1014 (closed)
mentioned in issue #3573 (closed)
mentioned in issue #3450 (closed)
mentioned in commit 6fa38409
mentioned in merge request !1140 (merged)
mentioned in commit smuething/gitlab-runner@666aff5d
mentioned in issue #4070 (closed)
mentioned in epic &1238 (closed)
mentioned in issue #6556 (closed)
mentioned in issue #5018 (closed)