Use tag_list from job definition, if available
What does this MR do and why?
This MR implements the first step of migrating tag_list usage from the legacy tables (the ci_build_tags join table and the tags table) to the new job definitions model, making job definitions the single source of truth (SSoT) for job tags. Reads fall back to the legacy tables when a job's immutable details have not yet been migrated to a job definition.
Background
As part of the CI/CD data sharding initiative, we are moving CI job metadata from legacy tables into the ci_job_definitions table. In Phase 2 (#565405), we started storing tag_list in job definitions. This MR switches the application to read tags from job definitions instead of the legacy ci_tags and ci_build_tags tables, whenever the job definition is available.
Changes
1. Override tag_list method in Ci::Build:
- Reads tag_list from job_definition.config[:tag_list] when available
- Falls back to the legacy tags association when job_definition is nil
- This ensures backward compatibility during the migration period
2. Add deprecation warning to tag_list= setter:
- Added ActiveSupport::Deprecation warning when tag_list= is called
- The setter is only used during record creation and should not be used after
- Deprecation target: GitLab 19.0
- This helps identify any code that incorrectly tries to update tags after job creation
3. Update eager loading:
- Modified eager_load_tags scope to include :job_definition association
- Ensures efficient queries when accessing tags through job definitions
4. Comprehensive test coverage:
- Tests for reading tags from job definition
- Tests for fallback to legacy tags when job definition is absent
- Tests for empty tag lists in job definition
- Tests for eager loading behavior
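The read path and the setter behaviour described in changes 1 and 2 can be sketched in plain Ruby. This is a minimal illustration, not the code in this MR: `JobDefinition` and `Build` are hypothetical stand-ins for the real ActiveRecord models, the warning is collected in an array instead of being emitted through ActiveSupport::Deprecation, and treating a missing `:tag_list` key as an empty list is an assumption.

```ruby
# Hypothetical stand-in for a job definition record; only `config` matters here.
JobDefinition = Struct.new(:config, keyword_init: true)

# Hypothetical stand-in for Ci::Build (the real class is an ActiveRecord model).
class Build
  attr_reader :deprecation_warnings

  def initialize(job_definition: nil, legacy_tags: [])
    @job_definition = job_definition
    @legacy_tags = legacy_tags
    @deprecation_warnings = []
  end

  # Prefer the job definition; use the legacy tags only when it is nil.
  def tag_list
    return @legacy_tags if @job_definition.nil?

    # Assumption: a missing or empty :tag_list means "no tags", not "fall back".
    Array(@job_definition.config[:tag_list])
  end

  # Records a warning on every call; the MR uses ActiveSupport::Deprecation instead.
  def tag_list=(tags)
    @deprecation_warnings << 'tag_list= is deprecated and should only be used during record creation'
    @legacy_tags = Array(tags)
  end
end

migrated = Build.new(
  job_definition: JobDefinition.new(config: { tag_list: %w[docker linux] }),
  legacy_tags: %w[stale]
)
legacy = Build.new(legacy_tags: %w[ruby])

puts migrated.tag_list.inspect # job definition wins over the legacy tags
puts legacy.tag_list.inspect   # no job definition, so the legacy tags are used
```

In this sketch a build with a job definition never consults the legacy tags, even when the definition's list is empty, which mirrors the "empty tag lists in job definition" test case listed above.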
Migration Path
This MR is part of a multi-phase migration:
- ✅ Phase 1: Add tag_list column to job definitions table
- ✅ Phase 2: Start storing tags in job definitions for new jobs
- 🔄 Phase 3 (this MR): Switch application to read from job definitions
- 📋 Phase 4: Upsert tag_list into tags table
- 📋 Phase 5: Backfill existing jobs with tags from legacy tables
- 📋 Phase 6: Drop legacy ci_build_tags table
Benefits
- Single source of truth: Tags are now consistently read from job definitions
- Sharding ready: Job definitions are designed for horizontal sharding
- Performance: Reduces joins to legacy tables once migration is complete
- Data consistency: Eliminates potential discrepancies between legacy tables and job definitions
References
Closes #569119
Related to:
- Epic: &11837 (closed) (CI/CD Data Sharding)
- Phase 2: #565405 (closed)
Screenshots or screen recordings
N/A - Backend code change
How to set up and validate locally
- Run the updated tests:
    bundle exec rspec spec/models/ci/build_spec.rb -e "tag_list"
- Test eager loading:
    # Verify no N+1 queries when accessing tags
    builds = Ci::Build.eager_load_tags.limit(10)
    # This should trigger a single query on p_ci_job_definition_instances/p_ci_job_definitions
    builds.each { |b| puts b.tag_list }
- Check deprecation warnings:
    # This should trigger a deprecation warning
    build = Ci::Build.first
    build.tag_list = ['new_tag']
    # DEPRECATION WARNING: tag_list= is deprecated...
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
- Performance: Eager loading updated to include job_definition
- Reliability: Maintains backward compatibility with fallback to legacy tags
- Security: No security implications
- Maintainability: Simplifies data model by establishing job definitions as SSoT; deprecation warnings guide future development
- Testing: Comprehensive test coverage for all scenarios including edge cases
- Database: Part of planned migration to drop legacy tables; no immediate database changes