Spike: Identify solutions to support group level identifier filtering with PostGres
The aim of this spike is to capture the technical unknowns and identify solutions for implementing filter by identifier work on the group level using PostGres for the requirement listed in &13340 (comment 2218346097)
We know that our features don't work on large groups, so identify the limits for group level at which we can support identifier filtering using PostGres.
Primary technical limitation:
- The identifier filter is required to work alongside other filters. However, PostgreSQL does not support creating a composite index that combines a text search query with other
WHEREconditions to perform a index only scan. Our analysis on this limitation can be found here.- This means we are constrained to use the indices created for other filters and perform the identifier text search as the final node in the database query.
- To ensure acceptable performance, the in-memory filtering at the final stage must process a minimal number of tuples.
- DB maintainers feedback on the query performance for project level implementation: !171463 (comment 2203656049)
Unknowns:
- We require the following backend API's to scale on the limits for a limit like < 10k vulnerabilities that we identified for the project level:
-
2.1 Full search API with the filter arguments including pagination.
-
2.2 Dropdown API for substring search.
-
2.3 SeveritiesCount API grouped by severities to present counts top table on the report UI.
- Limiting the feature only to groups with a lower number of vulnerabilities count:
- 3.1 To address reliability issues, how do we limit the APIs/endpoints for the limited groups (aka allowed groups) alone. (For project level it was the opposite, the feature is expected to work for 99% of the projects with the default filters on
gitlab.com.) - 3.2 For group level, we do not have a vulnerability statistics table upfront so restricting usage based on groups vulnerability size would be a challenge.
Knowns:
- Distribution of groups bucketed into counts of vulnerabilities for the default filter having an identifier on
gitlab.com. See: #506480 (comment 2235763427)
Edited by Bala Kumar