Right now there are 2 N + 1 gitaly endpoints in one controller. Futher, its not thought through to have both active and stale as sections. Just to detect an active branch we might loop through all branches but master and walk the tree for 1000 commits each!
The branches index test does over 100 Gitaly request, which is hurting the page performance and Gitaly.
@jramsay when can this be scheduled? Could you squeeze it in V11.2?
I think it's useful to be able to access a list of active (these are probably the ones I want to work on) or stale branches (these are probably the ones I want to delete).
@DouweM can we cache some of this data so it doesn't have to hit Gitaly for all the data on every page load? Also, I'm not sure why we need the most recent 1,000 commits from each branch.
Also, I'm not sure why we need the most recent 1,000 commits from each branch.
We don't return the commits, but we do check if the branch is up to 1000 commits old, else we bail. it might be an idea to just check when the last commit was made and use that to determine if a branch is active, or stale.
We don't return the commits, but we do check if the branch is up to 1000 commits old, else we bail. it might be an idea to just check when the last commit was made and use that to determine if a branch is active, or stale.
@zj Isn't that exactly what we're doing in Gitlab::Git::Branch#active?? Where are you seeing the code that reads up to 1000 commits?
#active uses #dereferenced_target, which is prepopulated when we get the list of branches from Gitaly, and doesn't trigger a new commit lookup.
@DouweM You're right. I was sloppy and just checked the Gitaly requests. However, after we do have the stale branches, we do check the number of commits the branch is behind:
I have to note that this is cached, but given Redis is configured as LRU cache and the fact that these calls aren't batched seems not so nice either.
@jramsay Did a quick test, and the overhead of making one call is sub ms. This means that even if this is a n + 1 call, this page will load about 10-20ms slower with the n + 1 calls. I think the real problem with this page timing out is the widget, and not the N + 1 calls.
I think the real problem with this page timing out is the widget
@zj by this you are referring to checking every branch possibly twice to generate the active and stale sections? I want to make sure I understand which part of the page is causing the problems. Sorry if I'm being slow.
@zj Can you please clarify what you think is the main issue on this page, and how we could go about fixing it?
@jramsay I'm putting a weight of 5 because of the uncertainty, if we get a clearer idea of what resolving this would look like, I can probably dial it down a bit. By the way, did you mean to assign yourself?
@zj we've got multiple other performance and N+1 issues scheduled for %11.4 so I'll push this back to %11.5 while we get a clearer idea of how to resolve this.
@jramsay@DouweM I think we should first schedule and implement gitaly#888 (closed), as that might improve performance. Than revisit this issue. I don't think we will close this issue afterwards, but we can better set the priority.
We can consider sending an offset and a limit to Gitaly, so that it won't return data we will not display anyway. Gitaly would need to be able to filter by active/stale. We can replace the page number based pagination with simple Prev/Next pagination so that we don't need the total count, just a "there is more" boolean.
All the information we need in this view (ID, title, committed date) is already in the RPC response, so we should be able to use this instead of looking up the commits separately. Alternatively, we can use Repository#commits_by(oid:) to request all of these commits in one go.
We just need the branch names, but the RPC used returns a lot more information that we then throw away. We could add a new RPC, or we could consider adding a merged boolean to the result of the RPC used by Gitlab::GitalyClient::RefService#local_branches, although we'd need to verify that this doesn't negatively affect other uses of this RPC that don't need this information. Related to https://gitlab.com/gitlab-org/gitlab-ce/issues/56665.
@zj-gitlab As part of the performance issue with gitlab#208738 (closed), I'm looking to implementing the first solution noted above as the first step. This will improve both end points so I think this is a good candidate to start with.
Is this something that Gitaly team would like to implement or should I looking into the gitaly code as well?
I'm not a Go expert, but I have dabble with gitlab-shell a bit before so I can start looking into this if that's preferred.
Given the timeframe of the issue I'm looking at, it may not be possible for gitaly team to jump in straight away so I'm happy start looking into it and consult gitaly team if I get stuck.
Please let me know what you think.
@dskim_gitlab I was planning on implementing gitaly#2704 (closed) this week, where I'd specifically look at a generic pagination pattern for Gitaly. For the local branches I was scanning through the code and it wasn't as easy I at first expected it to be. I did poke at it: gitaly!2251 (merged), hope that's fine.
If you were looking forward to working in Go, happy to find another issue to implement something fun.
Thanks a lot for that @zj-gitlab! What a timing 👍
I'll test it out with branch API when that MR gets merged.
However, it doesn't seem to support filtering by name or active/stale as far as I can see. I don't need the active/stale for the branch API, but filtering by a search term would be needed to take the full advantage of this pagination.
Would that be something you would be looking at after this? I can take a look at that next week if that makes sense to you?
I'm interested in getting my hands on Go a bit more when I get a chance, but I'll need to see how I go since I've got my hands full currently. 😄
GitLab is moving all development for both GitLab Community Edition
and Enterprise Edition into a single codebase. The current
gitlab-ce repository will become a read-only mirror, without any
proprietary code. All development is moved to the current
gitlab-ee repository, which we will rename to just gitlab in the
coming weeks. As part of this migration, issues will be moved to the
current gitlab-ee project.
If you have any questions about all of this, please ask them in our
dedicated FAQ issue.
Using "gitlab" and "gitlab-ce" would be confusing, so we decided to
rename gitlab-ce to gitlab-foss to make the purpose of this FOSS
repository more clear
I created a merge requests for CE, and this got closed. What do I
need to do?
Everything in the ee/ directory is proprietary. Everything else is
free and open source software. If your merge request does not change
anything in the ee/ directory, the process of contributing changes
is the same as when using the gitlab-ce repository.
Will you accept merge requests on the gitlab-ce/gitlab-foss project
after it has been renamed?
No. Merge requests submitted to this project will be closed automatically.
Will I still be able to view old issues and merge requests in
gitlab-ce/gitlab-foss?
Yes.
How will this affect users of GitLab CE using Omnibus?
No changes will be necessary, as the packages built remain the same.
How will this affect users of GitLab CE that build from source?
Once the project has been renamed, you will need to change your Git
remotes to use this new URL. GitLab will take care of redirecting Git
operations so there is no hard deadline, but we recommend doing this
as soon as the projects have been renamed.
Where can I see a timeline of the remaining steps?