Change 1:N to 1:1 relation between Vulnerability and Vulnerabilities::Finding
Engineering DRI: @quintasan ## Status update on 2024-04-09 After a sync chat with @abellucci we concluded it's reasonable to *pause* the work after https://gitlab.com/groups/gitlab-org/-/epics/11030+ due the following topics: 1. https://gitlab.com/groups/gitlab-org/-/epics/3430+ is on the horizon (not for 17.X) and continuing with this work might provide suboptimal performance 2. Ongoing talks about using workitems for vulnerabilities (https://gitlab.com/gitlab-org/gitlab/-/issues/448282+ and https://gitlab.com/gitlab-org/gitlab/-/issues/454256+) We can revisit this decision once we hash out some details over at https://gitlab.com/gitlab-org/gitlab/-/issues/452190+ ## What? ~~I propose merging `vulnerability_occurrences` table with `vulnerabilities`~~ It turns out that we cannot merge both tables because unique constraints don't work across all partitions. As such, the lowest hanging fruit we can achieve here is actually creating a 1:1 to relation between those two tables. Right now we're pretending to have a 1:1 to relation by doing `findings.first` but this constraint is not actually reflected in the database ## Why? 1. **It's a thing of the past** - one-to-many relation between `Vulnerability` and `Vulnerabilities::Finding` objects is a thing of the past as seen by the comment in [Vulnerability model](https://gitlab.com/gitlab-org/gitlab/-/blob/fb96d3a4c9c490005da1f4c4de6f12c5f2756f96/ee/app/models/ee/vulnerability.rb#L177) 1. **It's a weak guarantee of 1:1 relationship** - see table below, if there's a 1:1 to relationship then we should have the same number or records. We don't have that. Fortunately the latest orphaned `Vulnerabilities::Finding` was created at `2022-03-07 12:33:33.589893+00` which more than a year ago. 1. **It's a performance bottleneck** - both tables are huge (see table below) and we're frequently performing joins between those two tables despite there being a 1:1 relation between them 1. ~~**It makes our database harder to grok** - if we merge both tables, we will be left with two major entities in our database - `Vulnerability` records that represent vulnerabilities that, at some point, made it to the default pipeline as well as `Security::Finding` which represent vulnerabilities that have been found between ALL branch scans.~~ 1. ~~**One more table to partition** - by merging it, we don't have to partition it.~~ | vulnerabilities | vulnerability_occurrences | | ------ | ------ | | 74 121 790 | 75 343 856 | ## Why not? 1. **It's an expensive migration** - we would spend a lot of time not only on the database work but we would have to adjust all application logic to not rely on the `Vulnerabilities::Finding` model. 1. **We will need to adjust big chunks of our features** - if we are to get rid of the `Vulnerabilities::Finding` model then we will have to adjust our code not to rely on it. This will be a huge undertaking. 1. **`vulnerabilities` table will get new columns** - we will need to figure out which columns from `vulnerabilities_occurrences` to keep and which we can drop in order to limit the column growth ## High level plan 1. Add `finding_id` foreign key to `vulnerabilities` table 2. Backfill it based on the `vulnerability_occurrences.vulnerability_id` table **Now we have a 1:1 relationship (except it's duplicated on the other side).** We can pause here and work on other things and improve performance by replacing the JOIN queries 3. At this point we don't have to perform a JOIN operation, we can, instead do 2 SELECT queries 4. Drop `vulnerability_occurrences.vulnerability_id` foreign key **Now we have a proper 1:1 relationship.** We can pause here and work on other things and improve performance by replacing the JOIN queries. 5. Determine which columns from `vulnerability_occurrences` we want to keep. 6. Create those columns in `vulnerabilities` table. 7. Update logic to populate them in the `Vulnerability` model when creating a `Vulnerabilities::Finding` records 8. Create a batched background migration to backfill those columns for older rows **Now, we will be populating both tables with the necessary data**. We can pause here but not for long, lest we fill up the entire disk :sweat_smile: 9. Adjust the logic in our services to not rely on `Vulnerabilities::Finding`. Put it behind a feature flag 10. Enable it on a test project, find any bugs, fix them 11. Rinse and repeat until we're comfortable enough to enable it on `gitlab-org/gitlab` 12. Rinse and repeat until we're comfortable enough to enable it globally 13. Drop the `vulnerability_occurrences` table.
epic