Geo Primary Verification API: Add sort options with proper indexing for GET api/v4/admin/data_management/:model
Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.
Why are we doing this work
This work will be behind feature flag geo_primary_verification_view
This is extracted from #537710 (closed) to focus specifically on adding sort functionality to the Geo Primary Verification API endpoint. We need to ensure proper database indexing and performance considerations for sorting operations on potentially large tables like Merge Request Diffs or CI Job Artifacts.
Ideally, we should use existing indexes and avoid adding new ones, as we have 17 affected tables, some being potentially very large. This issue aims at fostering the discussion on what we should, and/or can, do around this.
What is needed in this change
This change focuses on adding sort functionality to the API endpoint from #537707 (closed) with proper performance considerations:
Expected sort parameters
-
?sort
-
id_desc|asc
- Primary sorting option (recommended as per SQL guidelines) -
created_desc|asc
- Do we really need this one? See⬆️ -
last_checksum_desc|asc
- Only when checksumming is enabled, requires proper indexing analysis
-
Could be that last_updated_at
is an alternative if available and there's an index, or if we need it and can take the performance hit.
Performance considerations
-
Avoid
created_at
sorting: As noted in our SQL guidelines, we should avoid ordering bycreated_at
for performance reasons -
Index analysis needed: Current indexes on
verified_at
may not be sufficient forlast_checksum
sorting - Large table impact: Tables like Merge Request Diffs or CI Job Artifacts can be huge, requiring careful query optimization
Relevant links
- Related issue: #537710 (closed)
- API Endpoint Definition: #537707 (closed)
- SQL Guidelines: https://docs.gitlab.com/development/sql/#ordering-by-creation-date-created_at
Implementation plan
- Service implementation: Add sorting logic to the service layer
- Performance testing: Validate query performance on large datasets
- Documentation: Update API documentation with supported sort options
Acceptance criteria
-
Sort by id_desc|asc
is implemented and performant -
Sort by last_checksum_desc|asc
is implemented with proper indexing (when checksumming enabled) if we decide to go ahead -
No created_at
sorting option is provided (following SQL guidelines) if absolutely needed. -
Database indexes are used for the sorting operations -
Performance testing shows acceptable query times on large tables -
API documentation is updated with supported sort parameters