Build an API to integrate Suggested Reviewer with GitLab
## 🧩 Problem to solve Based on the [solution validation](https://gitlab.com/gitlab-org/ux-research/-/issues/1664#note_807649941) the UI team has done for Suggested Reviewer (SR), we need to create an API to be able to send the SR artefact output to the UI. The artefacts are stored in the MR and will be stored in the Data storage we are currently building. This issue is mainly to understand/ discuss the steps and tasks we would need to be able to integrate with the GitLab code review experience. ## 💡 Proposal ### :envelope: Estimation Let's do a rough estimation to understand the scale of the system and the potential challenges we will need to address. - $`40.6K`$ merge requests created in recommender-service~3364772 on ~"GitLab Ultimate" in June 2022 ([source](https://app.periscopedata.com/app/gitlab/786738/Code-Review-MAU-Metrics?widget=11289926&udv=0)) - Assume that each merge request creation will trigger an SR call ```math \frac{40,6000}{\text{ 30 days in a month}} = 1,353\text{ SR request per day} ``` - SR query per second (QPS) ```math \text{SR QPS} = \frac{1,353}{10^5 \text{ seconds in a day}} = 0.01353 ``` - Assume peak SR QPS is 10 times the average number. ```math \text{Peak QPS} = 0.1353 ``` _Please note that besides merge request creation, new commits also change the nature of the merge request and hence would potentially trigger re-evaluation of the suggested reviewers. They are not factored in this calculation at this point due to the lack of data._ ### :spider_web: High-level design I propose to create a new gRPC interface in the bot service. #### Fetch suggested reviewers on commit push events ```plantuml skinparam shadowing false skinparam ParticipantPadding 60 skinparam BoxPadding 60 box "GitLab" participant "worker" as B participant "client" as G database DB as DB end box box "Suggested Reviewer" participant "bot service" as RB end box [o-> B: push event activate B B -> G: execute activate B activate G G -> RB: gRPC { project, changes } activate RB RB -> RB: fetch project's feature sets RB -> RB: fetch model RB -> RB: execute inference RB --> G: JSON payload deactivate RB G --> B: suggestions deactivate B deactivate G B -> DB: store suggestions activate B activate DB DB --> B deactivate DB deactivate B [<-- B: done deactivate B ``` - Enqueue a background job to fetch suggested reviewers when there is a committed push to a merge request - GitLab forwards gRPC calls to the suggested reviewer service - Save the suggestions to the GitLab PostgreSQL database, ie. `merge_request_reviewer_suggestions` table #### Query suggestions stored in GitLab DB ```plantuml skinparam shadowing false skinparam ParticipantPadding 60 skinparam BoxPadding 60 box "GitLab" participant "gitlab frontend" as U participant "gitlab backend" as G database DB as DB end box [o-> U: async load activate U U -> G: GraphQL { merge_request } activate G G -> G: check availability G -> G: check cache G -> DB: fetch suggestions activate DB DB --> G: suggestions deactivate DB G --> U: suggestions deactivate G [<-- U: render deactivate U ``` - Async GraphQL endpoint between UI components with GitLab backend - GitLab checks for availability, including feature flag, licenses and suggested reviewer settings - If allowed, fetch the suggestions from the GitLab PostgreSQL database, ie. `merge_request_reviewer_suggestions` table ### 🛠 Implementation plan #### Must have | Issue | Skillset/Groups Needed | Engineer(s) Assigned | Status | Milestone | Notes | | -------------- | ----------------- | ------ | ------ | ------ | ------ | | [Introduce a feature flag to control rolling out Suggested Reviewer API/UI](https://gitlab.com/gitlab-org/gitlab/-/issues/368341) | Ruby |@tle_gitlab | Completed | 15.3 | https://gitlab.com/gitlab-org/gitlab/-/merge_requests/92847| | ~~[Support configuring TLS for Suggested Reviewer](https://gitlab.com/gitlab-org/modelops/applied-ml/review-recommender/recommender-bot-service/-/issues/25)~~ | K8s | @tle_gitlab | Not required | 15.4 | Server-TLS is sufficient | | ~~Configure network connection between GitLab and Suggested Reviewer~~ | Terraform | @tle_gitlab | Not required | 15.3 | Both endpoints are already public-facing | | [Create v2 API of Suggested Reviewer](https://gitlab.com/gitlab-org/modelops/applied-ml/review-recommender/recommender-bot-service/-/issues/31) | Golang | @AndrasHerczeg | Completed | 15.4 | | | [Build Suggested Reviewer client in GitLab](https://gitlab.com/gitlab-org/gitlab/-/issues/368679) | Ruby | @a_akgun | Completed | 15.4 | https://gitlab.com/gitlab-org/gitlab/-/merge_requests/93634 | | [Create GraphQL endpoint for suggested reviewers](https://gitlab.com/gitlab-org/gitlab/-/issues/369591) | Ruby | @a_akgun | Completed | 15.4 | https://gitlab.com/gitlab-org/gitlab/-/merge_requests/96771 | | [Configure gRPC with TLS on Suggested Reviewer](https://gitlab.com/gitlab-org/modelops/applied-ml/review-recommender/cluster-management/-/issues/2) | Ruby | @tle_gitlab | Completed | 15.4| https://gitlab.com/gitlab-org/modelops/applied-ml/review-recommender/cluster-management/-/merge_requests/13 | | [Implement TLS on Suggested Reviewer request](https://gitlab.com/gitlab-org/modelops/applied-ml/review-recommender/recommender-bot-service/-/issues/19) | Ruby | @tle_gitlab | Completed | 15.4| https://gitlab.com/gitlab-org/gitlab/-/merge_requests/97946 | | [Deploy authentication secret on GitLab](https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/16283) | K8s | @tle_gitlab<br>~"Infrastructure Scalability" | Completed | 15.4| | | Deploy authentication secret on Suggested Reviewer | K8s | @tle_gitlab | Completed | 15.4| | | [Implement authorisation token on Suggested Reviewer request](https://gitlab.com/gitlab-org/modelops/applied-ml/review-recommender/recommender-bot-service/-/issues/19) | Ruby, Golang | @tle_gitlab<br>@AndrasHerczeg | Completed | 15.4| | | ~~Limit the number of suggestion calls per session/MR~~ | Ruby| | Not required | 15.4| | | Implement retry strategy on Suggested Review requests |Ruby | @a_akgun | Completed |15.4 | https://gitlab.com/gitlab-org/gitlab/-/issues/373173 | | Implement suggestion cache on GitLab |Ruby | @a_akgun | Completed |15.4 | https://gitlab.com/gitlab-org/gitlab/-/merge_requests/97622 | | ~~Handle Suggested Reviewer timeout and unavailability on GitLab~~ | Ruby| | Not required |15.4| | #### Nice to have | Issue | Skillset/Groups Needed | Engineer(s) Assigned | Status | Milestone | Notes | | -------------- | ----------------- | ------ | ------ | ------ | ------ | | [Export protocol definition to a Ruby gem](https://gitlab.com/gitlab-org/modelops/applied-ml/review-recommender/recommender-bot-service/-/issues/23) | Golang, Ruby | | | 15.8 | | | Implement telemetry on suggestion flow | Ruby, JavaScript | || 15.6 | | ## 🤔 Considerations For posterity, this section records other alternatives that we have discovered in this issue. <details> We will start with `Option 3` and move to `Option 1` in later iterations. ### Option 1: New gRPC interface with Suggested Reviewer service - Remove the API calls to GitLab API from `recommender-bot-service` - Send a gRPC call with the project and merge request changes to `recommender-bot-service` - Could be limited if we would like to extract more features in the future <details> <summary>Sequence Diagram</summary> ```plantuml skinparam shadowing false skinparam ParticipantPadding 10 skinparam BoxPadding 20 participant "gitlab frontend" as U participant "gitlab backend" as G box "Suggested Reviewer" participant "bot service" as RB end box [o-> U: load activate U U -> G: HTTP { merge_quest } \n(async) activate G G -> G: check availability G -> G: check cache activate G #DarkSalmon G -> RB: gRPC\n { project, merge_request_changes } activate RB RB -> RB: get latest project model\n from DB RB --> G: JSON payload deactivate RB deactivate G #DarkSalmon G --> U: suggestions deactivate G [<-- U: render deactivate U ``` </details> <details> <summary>Implementation plan</summary> - recommender-service~2492649 Create a new gRPC endpoint in `recommender-bot-service` to accept a list of MR changes - recommender-service~2492649 Export protocol definition to a Ruby gem (similar to how Gitaly [does](https://gitlab.com/gitlab-org/gitaly/tree/master/proto)) - recommender-service~2492649 Introduce a feature flag to control rolling out Suggested Reviewer API/UI - recommender-service~2492649 Build a client in GitLab (similar to [GitalyClient](https://gitlab.com/gitlab-org/gitlab/blob/master/lib/gitlab/gitaly_client.rb)) - Validate feature flag and setting - Limit the number of calls per session/MR - Include the authorisation token in the request - Record relevant system metrics - Retry with exponential back-off - Cache (key uses checksum on merge request changes) - Timeout </details> ### Option 2: HTTP interface with Envoy proxy - Utilise the existing Envoy HTTP interface - RESTful interface allows easier integration with other services in the future - Introduce additional transcoding latency <details> <summary>Sequence Diagram</summary> ```plantuml skinparam shadowing false participant "gitlab frontend" as U participant "gitlab backend" as G box "Suggested Reviewer" participant "envoy" as E participant "bot service" as RB end box participant "gitlab API" as A [o-> U: load activate U U -> G: HTTP { merge_quest } \n(async) activate G G -> G: check availability G -> G: check cache activate G #DarkSalmon G -> E: HTTP { project, merge_quest } activate E E -> RB: transcoded gRPC activate RB RB -> A: identify MR RB -> A: get MR changes RB -> RB: select latest project model\n from DB RB --> E: JSON payload deactivate RB E --> G: suggestions deactivate E deactivate G #DarkSalmon G --> U: suggestions deactivate G [<-- U: render deactivate U ``` </details> <details> <summary>Implementation plan</summary> - recommender-service~2492649 Introduce a feature flag to control rolling out Suggested Reviewer API/UI - recommender-service~2492649 Build a client in GitLab - Validate feature flag and setting - Limit the number of calls per session/MR - Include the authorisation token in the request - Record relevant system metrics - Retry with exponential back-off - Cache (key uses checksum on merge request changes) - Timeout </details> ### Option 3: Existing gRPC interface with Suggested Reviewer service - By-pass the existing Envoy HTTP interface - Remove additional transcoding latency - Require management of protocol definition in Ruby gem <details> <summary>Sequence Diagram</summary> ```plantuml skinparam shadowing false participant "gitlab frontend" as U participant "gitlab backend" as G box "Suggested Reviewer" participant "bot service" as RB end box participant "gitlab API" as A [o-> U: load activate U U -> G: HTTP { merge_request } \n(async) activate G G -> G: check availability G -> G: check cache activate G #DarkSalmon G -> RB: gRPC { project, merge_request } activate RB RB -> A: identify MR RB -> A: get MR changes RB -> RB: get latest project model\n from DB RB --> G: JSON payload deactivate RB deactivate G #DarkSalmon G --> U: suggestions deactivate G [<-- U: render deactivate U ``` </details> <details> <summary>Implementation plan</summary> - recommender-service~2492649 Export protocol definition to a Ruby gem (similar to how Gitaly [does](https://gitlab.com/gitlab-org/gitaly/tree/master/proto)) - recommender-service~2492649 Introduce a feature flag to control rolling out Suggested Reviewer API/UI - recommender-service~2492649 Implement TLS to secure calls from GitLab to Suggested Reviewer - recommender-service~2492649 Configure network connection between GitLab clusters and Suggested Reviewer cluster - recommender-service~2492649 Build a client in GitLab (similar to [GitalyClient](https://gitlab.com/gitlab-org/gitlab/blob/master/lib/gitlab/gitaly_client.rb)) - Validate feature flag and setting - Limit the number of calls per session/MR - Include the authorization token in the request - Record relevant system metrics - Retry with exponential back-off - Cache - Timeout </details> </details> ## :notebook: References - Example Suggested Reviewers clients [repository](https://gitlab.com/gitlab-org/modelops/applied-ml/review-recommender/suggested-reviewer-client)
epic