Next steps for the Mirroring feature
Starting in 9.4.1 we can already see a huge improvement in the reliability of the mirroring feature:
- Capacity stopped going below 0
ApplicationSetting.current.mirror_max_capacity
=> 20
Gitlab::Redis.with { |redis| redis.smembers('MIRROR_PULL_CAPACITY').length }
=> 19
This means https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/2460 is doing its job as expected.
Gitlab::Redis.with { |redis| redis.smembers('MIRROR_PULL_CAPACITY') }.each { |proj| prj = Project.unscoped.find_by_id(proj.to_i) ; if prj.nil? ; puts "Project #{proj} is deleted" ; else puts "Project #{proj} - #{prj.import_status}" end }
Project 3767628 - started
Project 3779371 - failed
Project 3780981 - failed
Project 3781591 - scheduled
Project 3781596 - scheduled
Project 3781602 - scheduled
Project 3781626 - scheduled
Project 3781934 - scheduled
Project 3782029 - scheduled
Project 3782984 - scheduled
Project 3783006 - scheduled
Project 3783041 - started
Project 3783044 - started
Project 3783147 - scheduled
Project 3783228 - scheduled
Project 3783314 - started
Project 3783439 - started
Project 3785733 - scheduled
=> ["3767628", "3779371", "3780981", "3781591", "3781596", "3781602", "3781626", "3781934", "3782029", "3782984", "3783006", "3783041", "3783044", "3783147", "3783228", "3783314", "3783439", "3785733"]
There are still some failed projects inside the capacity, but further investigation shows:
> Project.find_by(id: 3780981).mirror?
=> false
> Project.find_by(id: 3780981).pending_delete?
=> false
This leads me to believe the leftover failed entries are connected to the entanglement between importing and mirroring.
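To make the state of the capacity easier to read, here is a minimal sketch that tallies the entries by status and lists the ones that look stale. It is plain Ruby with no Rails or Redis, and the data is copied from the console output above. The helper names (`tally_by_status`, `stale_slots`) and the idea that only `scheduled` and `started` are legitimate slot holders are my assumptions, not something in the codebase.

```ruby
# Project id => import_status, copied from the console output above.
CAPACITY_STATUSES = {
  3767628 => 'started',   3779371 => 'failed',    3780981 => 'failed',
  3781591 => 'scheduled', 3781596 => 'scheduled', 3781602 => 'scheduled',
  3781626 => 'scheduled', 3781934 => 'scheduled', 3782029 => 'scheduled',
  3782984 => 'scheduled', 3783006 => 'scheduled', 3783041 => 'started',
  3783044 => 'started',   3783147 => 'scheduled', 3783228 => 'scheduled',
  3783314 => 'started',   3783439 => 'started',   3785733 => 'scheduled'
}.freeze

# Assumption: only these states mean a mirror is legitimately holding a slot.
ACTIVE_STATUSES = %w[scheduled started].freeze

# Hypothetical helper: count how many capacity entries are in each status.
def tally_by_status(statuses)
  statuses.values.each_with_object(Hash.new(0)) { |status, counts| counts[status] += 1 }
end

# Hypothetical helper: ids that occupy a slot without being scheduled or started.
def stale_slots(statuses)
  statuses.reject { |_id, status| ACTIVE_STATUSES.include?(status) }.keys
end

puts tally_by_status(CAPACITY_STATUSES).inspect
puts stale_slots(CAPACITY_STATUSES).inspect
```

On the data above this reports 5 started, 2 failed and 11 scheduled, so 2 of the 18 listed slots are held by projects that are not actually running a mirror update.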
Another thing we are seeing is a huge queue, caused by the small capacity we currently have on GitLab.com:
> Project.mirror.joins(:mirror_data).where("next_execution_timestamp <= ? AND import_status NOT IN ('scheduled', 'started')", Time.now).order_by(:next_execution_timestamp).count
=> 24068
This means two things:
- We need to increase the capacity from 20. My suggestion would be to roll back to the previous default value (150).
- We need https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/2366 released to make sure every mirror gets its turn to run. Right now, a mirror that takes one second to update will jump ahead of almost the entire queue.
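As a rough back-of-the-envelope for the capacity suggestion, the sketch below estimates how many full rounds of capacity it takes to drain the current backlog. It assumes every slot processes exactly one mirror per round and that nothing new is queued in the meantime, which is a simplification; `rounds_to_drain` is a hypothetical helper, not existing code.

```ruby
# Hypothetical helper: number of full-capacity rounds needed to work through
# `queued` mirrors when `capacity` of them can run at once.
# Simplifying assumption: one mirror per slot per round, no new arrivals.
def rounds_to_drain(queued, capacity)
  raise ArgumentError, 'capacity must be positive' unless capacity > 0

  (queued + capacity - 1) / capacity # integer ceiling division
end

QUEUED = 24_068 # count returned by the query above

[20, 150].each do |capacity|
  puts "capacity #{capacity}: #{rounds_to_drain(QUEUED, capacity)} rounds"
end
```

With the numbers above this gives 1204 rounds at a capacity of 20 versus 161 rounds at 150, which is the main reason I would like to go back to the old default.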
We also need to start looking into ways of removing the entanglement and technical debt we are seeing in all this code. A discussion about this is already happening in https://gitlab.com/gitlab-org/gitlab-ee/issues/2954. Feel free to chime in.
These are all good indicators that the feature has a promising future. We should keep monitoring it to make sure everything is working as expected.
/cc @DouweM @pcarranza @stanhu