Sign in or sign up before continuing. Don't have an account yet? Register now to get started.
Follow-up from "SPP - Scan committed diffs instead of entire file"
### Overview
This issue is a follow-up to the work done in https://gitlab.com/gitlab-org/gitlab/-/issues/469161 to use `DiffBlobs()` RPC to scan diff changes for a particular git push operation instead of scanning entire files. With this change [in place](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/162782), we still need to address a few follow-ups to enhance the overall stability and performance of the feature.
Most importantly, we have to drop the use of `GetTreeEntries()` RPC to retrieve file paths and commit revision/sha for each finding in favour of utilising `ListAllCommits()` and `FindChangedPaths` RPCs (which we already use in the diff scanning approach) to retrieve that information.
### Proposal
*For Entire-file Scanning*
* Keep using `GetTreeEntries()` until entire-file-scanning is phased out and removed.
*For Diff-based Scanning*
* Update [secret push check](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/gitlab/checks/secrets_check.rb) to depend on `ListAllCommits()` and `FindChangedPaths()` instead of `GetTreeEntries()`.
### Original Discussions
The following discussions from !158984 should be addressed:
#### Remaining Follow-ups
- [x] @ahmed.hemdan started a [discussion](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/158984#note_2027671744):
> **suggestion (blocking)**: As discussed earlier, I think it's very important to drop `GetTreeEntries()` and retrieve file path and commit sha for a certain diff from `ListAllCommits()` and `FindChangedPath()` RPCs instead. This would basically save us a lot of calls to Gitaly and several git tree walks to load such data, and would generally improve the push check performance.
>
> We would have to reimplement most of the methods building up the message we display to users though, so I realize it's not a simple task. I would recommend we create a follow-up issue from this thread to tackle this improvement, and move forward with it as soon as this one is merged.
_**See this comment: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/158984#note_2033624827 for more details.**_
---
#### Tackled Follow-ups
- [x] @ahmed.hemdan started a [discussion](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/158984#note_2033511976):
> That's suprising because the contents are identical.
>
> Shall we consider excluding `diff_blob` objects that have an equal `patch` field? :thinking:
- [x] @ahmed.hemdan started a [discussion](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/158984#note_2027671648):
> **suggestion (non-blocking)**: To reduce the cognitive complexity of this class, I imagine that at some point we will want to move the retrieval of diffs to a separate class, but it can be done in a follow-up. Let me know your thoughts!
_**Will be handled in https://gitlab.com/gitlab-org/gitlab/-/issues/481883.**_
- [x] @ahmed.hemdan started a [discussion](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/158984#note_2027671997):
> **question (non-blocking)**: As we no longer use `list_blobs`, I wonder if this one should be removed. Can you try running specs after removing this line and seeing if anything breaks?
_**No longer relevant.**_
- [x] @ahmed.hemdan started a [discussion](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/158984#note_2033511996):
> And update the code of `create_commit` method to not [delete the branch](https://gitlab.com/gitlab-org/gitlab/-/blob/f16d03d5fcaf87469554bbd04e3ce89edaea1767/ee/spec/support/shared_contexts/secrets_check_shared_contexts.rb#L274-276) (if you haven't already), so that the new commit is updating the same file from the initial commit.
_**Will be handled as part of https://gitlab.com/gitlab-org/gitlab/-/merge_requests/162146.**_
- [x] @vbhat161 started a [discussion](https://gitlab.com/gitlab-org/security-products/secret-detection/secret-detection-service/-/merge_requests/6#note_2035108556):
> Let's go ahead with `payloads` then :thumbsup:
>
> /cc @serenafang In case we still have time to accommodate this naming change in [your MR](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/158984 "SPP - Scan committed diffs instead of entire file")
_**Had already been handled in recent merge requests – see https://gitlab.com/gitlab-org/gitlab/-/merge_requests/168364.**_
issue