Address vulnerability reads owasp_top_10 storing only the oldest year version mapping
Problem
When a vulnerability has OWASP identifiers for both 2017 and 2021, we present it only under the 2017 category.
This happens because vulnerability_reads.owasp_top_10 can store only one value. The column is backed by an Active Record enum, which maps and stores a single scalar value per record and does not support arrays.
Background: During the initial technical discussion of the schema design, we did not realise that a single vulnerability can belong to multiple owasp_top_10 enum values. This scenario is possible when a vulnerability has mappings for multiple years, such as 2017 and 2021. See the example image in #438561 (closed), where a semgrep analyzer rule tags a single vulnerability under both 2017 and 2021.
Note: GitLab native analyzers do not yet use the dual mappings, so we will likely see more reports of this bug after %16.10, when they release multiple mappings. See: &10970 (comment 1636001776) and #438561 (comment 1731832516)
Proposals:
The options for us are:
1) Do not support an array. Always map a vulnerability to the latest year when it has identifiers for multiple years. This approach is easier to implement than the alternative.
2) Map a vulnerability under multiple years, so that it shows under each year it belongs to. This requires backend schema changes and is significantly more effort than option 1.
Implementation details:
Option 1
Modify the ingestion logic to map the vulnerability to the latest year only.
- Modify the ingestion logic for vulnerability_reads to populate 2021 mappings. This means we will have to show the 2021 group_by option in the UI and stop showing 2017; we need a deprecation announcement for this step.
- Data migration to update vulnerability_reads records that are missing 2021 categories. At this step, if a vulnerability has both 2017 and 2021 mappings, we will override the 2017 mapping with the 2021 mapping.
- Data migration to remove the owasp_top_10 column value for vulnerabilities which have only 2017 categories.
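The "latest year wins" rule from Option 1 could be sketched roughly as follows. This is a plain Ruby illustration, not the actual ingestion code: the helper names `owasp_year` and `latest_owasp_category` are hypothetical, and the identifier format (`A1:2017-Injection`) is an assumption about how the category names are written.

```ruby
# Assumed identifier format: "A<rank>:<year>-<name>", e.g. "A1:2017-Injection".
OWASP_YEAR_PATTERN = /\AA\d{1,2}:(\d{4})/

# Hypothetical helper: returns the year embedded in an OWASP identifier name,
# or nil when the name does not look like an OWASP Top 10 category.
def owasp_year(identifier_name)
  match = OWASP_YEAR_PATTERN.match(identifier_name)
  match && match[1].to_i
end

# Hypothetical helper: picks the single category to store in the scalar
# column by keeping the identifier with the most recent year.
def latest_owasp_category(identifier_names)
  identifier_names
    .select { |name| owasp_year(name) }
    .max_by { |name| owasp_year(name) }
end
```

With this rule, a vulnerability tagged for both years is stored under its 2021 category, and a vulnerability with no OWASP identifier yields `nil`.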
Option 2
1. Create a new column of smallint array type
2. Modify the ingestion logic of the vulnerability_reads table to populate both the old and the new column.
3. Create a backfill migration to populate the new column.
4. After the migration is complete, change the Vulnerability Reads finder to read from the new column. Verify things are working fine and then proceed with the cleanup steps below.
5. Remove the dual population that step 2 added to the ingestion logic.
6. Remove the old column.
6. Remove the old column.
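The dual population in step 2 might look something like the sketch below. It is plain Ruby with several assumptions: the `OWASP_TOP_10_ENUM` hash is a made-up subset with illustrative integer values, `owasp_top_10_categories` is a hypothetical name for the new smallint array column, and the old column is assumed to keep its current behaviour of storing the lowest (oldest) enum value.

```ruby
# Hypothetical subset of the enum mapping; integer values are illustrative only.
OWASP_TOP_10_ENUM = {
  "A1:2017-Injection" => 1,
  "A3:2021-Injection" => 13
}.freeze

# Builds the attributes written during the transition period, when both the
# old scalar column and the new array column are populated.
def owasp_attributes(identifier_names)
  # Unknown identifiers map to nil and are dropped by filter_map.
  values = identifier_names.filter_map { |name| OWASP_TOP_10_ENUM[name] }.uniq.sort
  {
    owasp_top_10: values.first,      # old scalar column: one value only
    owasp_top_10_categories: values  # hypothetical new smallint[] column
  }
end
```

Once the finder reads from the array column (step 4), the `owasp_top_10` key can be dropped from this hash as part of step 5.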