Geo Primary Verification API: Add sort options with proper indexing for GET api/v4/admin/data_management/:model
Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.
Why are we doing this work
This work will be behind feature flag geo_primary_verification_view
This is extracted from #537710 (closed) to focus specifically on adding sort functionality to the Geo Primary Verification API endpoint. We need to ensure proper database indexing and performance considerations for sorting operations on potentially large tables like Merge Request Diffs or CI Job Artifacts.
Ideally, we should use existing indexes and avoid adding new ones, as we have 17 affected tables, some being potentially very large. This issue aims at fostering the discussion on what we should, and/or can, do around this.
What is needed in this change
This change focuses on adding sort functionality to the API endpoint from #537707 (closed) with proper performance considerations:
Expected sort parameters
-
?sort-
id_desc|asc- Primary sorting option (recommended as per SQL guidelines) -
created_desc|asc- Do we really need this one? See⬆️ -
last_checksum_desc|asc- Only when checksumming is enabled, requires proper indexing analysis
-
Could be that last_updated_at is an alternative if available and there's an index, or if we need it and can take the performance hit.
**For the MVC, we'll restrict ourselves to sorting by primary key. **
Additional sort parameters will be added based on customer requests when we revisit the dashboard: this issue is moved to &16588.
Performance considerations
-
Avoid
created_atsorting: As noted in our SQL guidelines, we should avoid ordering bycreated_atfor performance reasons -
Index analysis needed: Current indexes on
verified_atmay not be sufficient forlast_checksumsorting - Large table impact: Tables like Merge Request Diffs or CI Job Artifacts can be huge, requiring careful query optimization
Relevant links
- Related issue: #537710 (closed)
- API Endpoint Definition: #537707 (closed)
- SQL Guidelines: https://docs.gitlab.com/development/sql/#ordering-by-creation-date-created_at
Implementation plan
- Service implementation: Add sorting logic to the service layer
- Performance testing: Validate query performance on large datasets
- Documentation: Update API documentation with supported sort options
Acceptance criteria
-
Sort by id_desc|ascis implemented and performant -
Sort by last_checksum_desc|ascis implemented with proper indexing (when checksumming enabled) if we decide to go ahead -
No created_atsorting option is provided (following SQL guidelines) if absolutely needed. -
Database indexes are used for the sorting operations -
Performance testing shows acceptable query times on large tables -
API documentation is updated with supported sort parameters