Gitaly issues: https://gitlab.com/gitlab-org/gitaly/-/issues

# Add RPC to support bulk deletes for branches

*Issue [#5497](https://gitlab.com/gitlab-org/gitaly/-/issues/5497), Vasilii Iakliushin, 2024-01-01*

### Problem
Reported in issue: https://gitlab.com/gitlab-org/gitlab/-/issues/420795+.
Rails calls the `UserDeleteBranch` RPC in a loop when we delete multiple branches.
That leads to performance problems: for N branches we make N Gitaly calls.
Also, we invoke server hooks N times instead of only once.
### Proposal
Introduce an RPC to support bulk deletes for branches. Potentially, it can be applied to tags too.
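For N branches, the deletions could be collapsed into a single `git update-ref --stdin` transaction on the Gitaly side. A minimal sketch of building that input (the helper name is hypothetical, not actual Gitaly code):

```go
package main

import (
	"fmt"
	"strings"
)

// buildDeleteInstructions renders input for `git update-ref --stdin`,
// deleting all branches in one git invocation instead of N RPC calls.
// A sketch only; the real RPC would also verify expected old OIDs and
// run server hooks once around the whole transaction.
func buildDeleteInstructions(branches []string) string {
	var b strings.Builder
	for _, br := range branches {
		fmt.Fprintf(&b, "delete refs/heads/%s\n", br)
	}
	return b.String()
}

func main() {
	fmt.Print(buildDeleteInstructions([]string{"feature-a", "feature-b"}))
	// delete refs/heads/feature-a
	// delete refs/heads/feature-b
}
```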
[Suggestion from @pks-gitlab](https://gitlab.com/gitlab-org/gitlab/-/issues/420795#note_1504092229):
> If we want to go that way I would argue that we should generalize the interface to a `UserUpdateReferences` RPC with a similar design as in our recently-introduced `UpdateReferences` RPC. Like this we could replace `UserCreateBranch`, `UserDeleteBranch` and `UserUpdateBranch` with a single RPC.

# Missing repository metadata about UserCommitFiles

*Issue [#5433](https://gitlab.com/gitlab-org/gitaly/-/issues/5433), Steve Xuereb, 2024-03-25*

## Overview
In the RPC logs for Gitaly we usually have fields like `grpc.request.glProjectPath` and `grpc.request.repoPath`
![Screenshot_2023-07-10_at_15.55.38](/uploads/efc6c8cfdd30eec336a62eec5dcccda8/Screenshot_2023-07-10_at_15.55.38.png)
[source](https://log.gprd.gitlab.net/goto/7b64d530-1f29-11ee-a017-0d32180b1390)
However, the `UserCommitFiles` RPC doesn't seem to include this information at all:
![Screenshot_2023-07-10_at_15.56.45](/uploads/2f82e83ebd8fab7f939e3f69f5c3b9d3/Screenshot_2023-07-10_at_15.56.45.png)
[source](https://log.gprd.gitlab.net/goto/a0dfb640-1f29-11ee-8afc-c9851e4645c0)

# Cache GetArchive RPC response using streamcache

*Issue [#4731](https://gitlab.com/gitlab-org/gitaly/-/issues/4731), Igor Drozdov, 2024-03-25*

Currently, there's an option to [cache](https://gitlab.com/gitlab-org/gitlab/-/blob/a4798020782b37c7367091b367270dd80a66ebf3/workhorse/internal/git/archive.go#L71) archives in Workhorse. This option has been [disabled](https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/16325) for SaaS because [Workhorse processes have no shared or persistent storage](https://gitlab.com/gitlab-org/gitlab/-/issues/369437#note_1237388791). GitLab.com solves the archive cache problem (for public repos only) by using a CDN. However, archives are still cached in Workhorse for self-managed installations.
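A Gitaly-side cache would need a stable key per archive request; a rough sketch of deriving one from the request parameters (a hypothetical helper, not an actual streamcache API):

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// archiveCacheKey derives a stable cache key for a GetArchive request.
// Hypothetical sketch: a real cache would also need invalidation and
// would scope keys per storage/repository.
func archiveCacheKey(repo, commitID, format, prefix string) string {
	h := sha256.Sum256([]byte(repo + "\x00" + commitID + "\x00" + format + "\x00" + prefix))
	return fmt.Sprintf("%x", h[:8])
}

func main() {
	k1 := archiveCacheKey("group/project", "deadbeef", "tar.gz", "project-v1")
	k2 := archiveCacheKey("group/project", "deadbeef", "tar.gz", "project-v1")
	fmt.Println(k1 == k2, len(k1)) // true 16
}
```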
There's an [idea](https://gitlab.com/gitlab-org/gitlab/-/issues/369437#note_1237385518) to implement a `GetArchive` RPC cache in Gitaly, similar to the one we have for [pack-objects](https://gitlab.com/gitlab-org/gitaly/-/blob/4e47b5b3766375c6ac7a94cee742c9e9acca39b1/internal/streamcache/cache.go#L144), because Gitaly has persistent storage.

# Remove Ruby specific special errors

*Issue [#4619](https://gitlab.com/gitlab-org/gitaly/-/issues/4619), Pavlo Strokov, 2022-11-14*

While migrating old RPC implementations from the Ruby sidecar to Go, in some cases specific errors were preserved as-is to support the same behaviour. As there are no more Ruby RPCs, we should drop these special cases. This will probably require changes on the GitLab side, so it should be done with caution.
Initial discussion about it started [here](https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5029#note_1165898958).
/cc @pks-t @andrashorvath

# When refs are sorted by dates, order is incorrect

*Issue [#3496](https://gitlab.com/gitlab-org/gitaly/-/issues/3496), Sean Carroll, 2022-07-18*

Taken from the [Merge Request](https://gitlab.com/gitlab-org/gitaly/-/merge_requests/2932#note_523185315):
<hr/>
There's buggy behavior:
When refs are sorted by committerdate and a page token is passed, unexpected results are returned. It works fine when refs are sorted by refname, though.
`IsPageToken`, defined in `internal/gitaly/service/ref/refs.go`, uses `bytes.Compare`, which compares refs alphabetically.
That works when refs are sorted by refname, but when we sort by another field, `IsPageToken` unexpectedly returns the first result that is greater alphabetically.
For example, if refs have been created in the following order: `a` `c` `b` `d` and we pass `{ SortBy: 'updated_asc', PageToken: 'b', Limit: 1 }`, then we get the ref `b`, while the expected result is `d`.
We can use `bytes.HasPrefix` instead to fix this case.
Related issue: https://gitlab.com/gitlab-org/gitlab/-/issues/321196
Feature flags:
- `branches_pagination_without_count`
- `branch_list_keyset_pagination`

# Gitaly: Add batch version of CommitStats RPC

*Issue [#3375](https://gitlab.com/gitlab-org/gitaly/-/issues/3375), Andy Schoenen, 2023-07-26*

## Problem
I'm working on a service that sends branch data to Jira (https://gitlab.com/gitlab-org/gitlab/-/issues/263240). Currently, we sync branch data only on events, like when a new branch is created. But we also want to be able to sync multiple branches in one go when a new Project gets connected to Jira.
The data includes the number of files that were changed. We currently use the [CommitStats RPC](https://gitlab.com/gitlab-org/gitaly/-/blob/master/proto/commit.proto#L52) but we ran into an N+1 problem because it is not possible to batch load stats for multiple commits. I tried it in this MR: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/48706#note_459086632.
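The N+1 pattern versus a batched call can be sketched with a stand-in client (the method names are hypothetical; `BatchCommitStats` does not exist yet):

```go
package main

import "fmt"

// statsClient stands in for a hypothetical Gitaly client; CommitStats is
// the per-commit call, BatchCommitStats the proposed batched variant.
type statsClient struct{ calls int }

func (c *statsClient) CommitStats(rev string) int {
	c.calls++ // one RPC per commit
	return len(rev)
}

func (c *statsClient) BatchCommitStats(revs []string) []int {
	c.calls++ // a single RPC for all commits
	out := make([]int, 0, len(revs))
	for _, r := range revs {
		out = append(out, len(r))
	}
	return out
}

func main() {
	revs := []string{"c1", "c2", "c3"}

	// N+1 pattern: one RPC per commit.
	n1 := &statsClient{}
	for _, r := range revs {
		_ = n1.CommitStats(r)
	}

	// Batched: one RPC, regardless of how many commits we sync.
	b := &statsClient{}
	_ = b.BatchCommitStats(revs)

	fmt.Println(n1.calls, b.calls) // 3 1
}
```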
## Possible solutions
Create a batch version of the `CommitStats` RPC.

# Gitaly: Add support for multiple commits (batch version) in diff RPC

*Issue [#3374](https://gitlab.com/gitlab-org/gitaly/-/issues/3374), Andy Schoenen, 2023-07-26*

## Problem
When we sync a commit to Jira, we currently [limit the diff files we sync to 10](https://gitlab.com/gitlab-org/gitlab/-/blob/314be41fd6341ae385505ef4c281532e0faedc67/lib/atlassian/jira_connect/serializers/commit_entity.rb#L33).
This is due to N+1 problems.
We currently use the [diffs RPC](https://gitlab.com/gitlab-org/gitaly/-/blob/master/proto/diff.proto#L12) to get those files, but we ran into an N+1 problem because it is not possible to batch load diffs for multiple commits. I tried it in this MR: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/48706#note_459086632.
We only need the `path`, `change type` (added, deleted, moved, modified), `number of added lines`, and `number of removed lines` ([lib/atlassian/jira_connect/serializers/file_entity.rb#L9-33](https://gitlab.com/gitlab-org/gitlab/-/blob/v13.6.1-ee/lib/atlassian/jira_connect/serializers/file_entity.rb#L9-33)).
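Those fields could come back as one flat batched response and be grouped per commit on the client; a sketch with illustrative struct names (not an actual Gitaly proto definition):

```go
package main

import "fmt"

// FileStats holds the per-file fields the Jira sync needs. The shape is
// illustrative only and not taken from an actual Gitaly proto definition.
type FileStats struct {
	CommitID, Path, Change string
	Added, Removed         int
}

// statsByCommit groups one batched response for N commits, instead of
// making N single-commit diff RPC calls.
func statsByCommit(batch []FileStats) map[string][]FileStats {
	out := make(map[string][]FileStats)
	for _, fs := range batch {
		out[fs.CommitID] = append(out[fs.CommitID], fs)
	}
	return out
}

func main() {
	batch := []FileStats{
		{CommitID: "c1", Path: "a.rb", Change: "modified", Added: 3, Removed: 1},
		{CommitID: "c1", Path: "b.rb", Change: "added", Added: 10},
		{CommitID: "c2", Path: "a.rb", Change: "deleted", Removed: 7},
	}
	grouped := statsByCommit(batch)
	fmt.Println(len(grouped["c1"]), len(grouped["c2"])) // 2 1
}
```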
## Possible solutions
We can either support multiple commits in the diffs RPC, or create a new RPC that returns just the file paths, change type, and changed-line counts for multiple commits.

# "Validating diff contents" cancelled after 50 seconds when pushing new branch with tens of thousands of objects

*Issue [#3273](https://gitlab.com/gitlab-org/gitaly/-/issues/3273), Katrin Leinweber (GTLB), 2023-10-24*

Creating a new branch in GitLab with `git push -u …GL…upstream… …local_branch…` can fail due to push rule problems (see https://gitlab.com/gitlab-org/gitlab/-/issues/9326 & https://gitlab.com/gitlab-org/gitlab-foss/-/issues/57067 for example), in which case the observed behavior includes:
```
remote: Validating diff contents... (cancelled after …ca…50k…ms…)
…
error: failed to push some refs to 'git@gitlab.com:….git'
```
However, it can also fail if the pushed content is very large (ca. 8k commits & 30k files with many renames in this case). Additional output was observed with `git push -v`:
```
Enumerating objects: …ca…120k…, done.
…
remote: Checking connectivity: …ca…85k…, done.
remote: GitLab: Internal API error (502)
…
! [remote rejected] …branch…name… -> …branch…name… (pre-receive hook declined)
```
(although `hook declined` may be an unfitting error message in this case), and in `gitlab-workhorse/current`:
```
{ …
"duration_ms": …ca…40k…,
"error": "badgateway: failed to receive response: EOF",
"level": "error",
"method": "POST",
…
"uri": "/api/v4/internal/allowed"
}
{ …
"duration_ms": …ca…40k…,
"host": "…",
"level": "info", # side note: This seems weird for a 502
"method": "POST",
…
"status": 502,
…
"uri": "/api/v4/internal/allowed",
"user_agent": "Go-http-client/1.1",
…
}
```
and in `gitaly/current`:
```
{
"duration_ms": …ca…40k…,
"error": "Internal API error (502)",
"level": "error",
"method": "POST",
"msg": "Internal API error",
…
"url": "http://…/api/v4/internal/allowed"
}
```
The branch in question seems significantly larger than "one of the largest representative MR[s]" ([internal chat](https://gitlab.slack.com/archives/C3JJET4Q6/p1604421168348500)) we use [in staging](https://staging.gitlab.com/gpt/large_projects/gitlabhq1/-/merge_requests/10495).
### Attempted solutions
- disabling all push rules (branch & file names) => not successful
- increasing [Gitaly timeouts](https://gitlab.com/gitlab-org/gitlab/-/blob/v13.2.6-ee/doc/user/admin_area/settings/gitaly_timeouts.md) => not successful
- increasing [the puma\['per\_worker\_max\_memory\_mb'\] setting](https://gitlab.com/gitlab-org/omnibus-gitlab/-/blame/13.2.6+ee.0/files/gitlab-config-template/gitlab.rb.template#L887) as pioneered by https://gitlab.com/gitlab-org/gitlab/-/issues/267499 => eliminated `PumaWorkerKiller: Out of memory` events that were seen before (environment: 1 Gitaly & 3 app nodes with 12 GB each), but not the here-reported issue
### Successful work-around
1. created a new personal project
1. pushed the branch into that
1. set it up as a fork of the main repo [via the API](https://gitlab.com/gitlab-org/gitlab/-/blob/v13.2.6-ee/doc/api/projects.md#fork-relationship).
1. created an MR & merged
### Attempted, more complicated work-around
1. `git log --oneline local_branch..upstream_merge_target > /tmp/local_commits.txt`
- the `HEAD~1234:` syntax didn't work, probably because of a too complicated branch history (see `Desired outcome` section below)
1. "bisecting" the branch log manually by picking a commit ID from the middle, around 25%, 10%, etc., to use in
1. `git push upstream …local…SHA…:…upstream…branch…`
1. repeating 3. to push the branch up in slices/chunks, so to speak.
### Not attempted
@zj-gitlab suggested increasing [puma['worker_timeout']](https://gitlab.com/gitlab-org/omnibus-gitlab/-/blob/13.2.6+ee.0/files/gitlab-config-template/gitlab.rb.template#L870), which was overlooked before (probably by me).
### Desired outcome
All this leads up to the question of whether and how we can improve Gitaly and/or Puma performance for large/huge push events.
On the user side, such events can most likely be avoided by pushing often. For example, when preparing a release branch from many topic branches, it is better to push after each merge/rebase/cherry-pick ;-)
### Reported by
A large Premium customer ([internal ticket](https://gitlab.zendesk.com/agent/tickets/176243)).

# Implement a streaming variant of the UserCommitFiles RPC

*Issue [#3192](https://gitlab.com/gitlab-org/gitaly/-/issues/3192), Markus Koller, 2023-04-19*

### Problem to solve
Files that are uploaded into a Git repository via the `UserCommitFiles` Gitaly RPC are fully loaded into memory in both Rails and Gitaly-Ruby: https://gitlab.com/gitlab-org/gitlab/-/issues/200054
This wastes memory and can also lead to timeouts on large files.
### Proposal
We should implement a streaming variant of this RPC and use it for all Git uploads.
Technical details are TBD; also see @stanhu's initial investigation at https://gitlab.com/groups/gitlab-org/-/epics/4550#note_425300913.