Use tag_list from job definition, if available

What does this MR do and why?

This MR implements the first step of migrating tag_list usage from the legacy tables (the ci_build_tags join table and the tags table) to the new job definitions model. The goal is to make job definitions the single source of truth (SSoT) for job tags. Reads fall back to the legacy tables when a job's immutable details have not yet been migrated to a job definition.

Background

As part of the CI/CD data sharding initiative, we are moving CI job metadata from legacy tables into the ci_job_definitions table. In Phase 2 (#565405), we started storing tag_list in job definitions. This MR switches the application to read tags from job definitions instead of the legacy ci_tags and ci_build_tags tables, whenever the job definition is available.

Changes

1. Override tag_list method in Ci::Build:

  • Reads tag_list from job_definition.config[:tag_list] when available
  • Falls back to the legacy tags association when job_definition is nil
  • This ensures backward compatibility during the migration period
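The read path with its fallback can be sketched in plain Ruby. This is a minimal illustration, not the code in this MR: TagListSketch, JobDefinitionSketch, and legacy_tags are hypothetical stand-ins for Ci::Build, its job definition, and the legacy tags association, and the handling of a missing :tag_list key is an assumption.

```ruby
# Hypothetical stand-in for a job definition; only `config` mirrors the MR text.
class JobDefinitionSketch
  attr_reader :config

  def initialize(config)
    @config = config
  end
end

# Hypothetical stand-in for Ci::Build, without ActiveRecord.
class TagListSketch
  attr_reader :job_definition, :legacy_tags

  def initialize(job_definition: nil, legacy_tags: [])
    @job_definition = job_definition
    @legacy_tags = legacy_tags # stands in for the legacy tags association
  end

  # Prefer the job definition; fall back to legacy tags only when it is nil.
  def tag_list
    return legacy_tags.dup if job_definition.nil?

    # Assumption: a missing :tag_list key is treated as an empty list.
    Array(job_definition.config[:tag_list])
  end
end

migrated = TagListSketch.new(
  job_definition: JobDefinitionSketch.new({ tag_list: %w[docker linux] }),
  legacy_tags: %w[stale]
)
not_migrated = TagListSketch.new(legacy_tags: %w[ruby])

puts migrated.tag_list.inspect
puts not_migrated.tag_list.inspect
```

Note that in this sketch an empty tag list in the job definition does not trigger the fallback; only a nil job definition does.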

2. Add deprecation warning to tag_list= setter:

  • Adds an ActiveSupport::Deprecation warning when tag_list= is called
  • The setter is meant to be used only during record creation and should not be called once the job exists
  • Deprecation target: GitLab 19.0
  • This helps identify any code that incorrectly tries to update tags after job creation
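The shape of a setter that warns on every call can be sketched as follows. This is an illustration only: it uses Kernel#warn in place of ActiveSupport::Deprecation so that it runs without Rails, and DeprecatedSetterSketch, the injectable warner, and the message text are all hypothetical.

```ruby
# Hypothetical stand-in for Ci::Build; the real MR uses ActiveSupport::Deprecation.
class DeprecatedSetterSketch
  attr_reader :tag_list

  # `warner` is injectable so callers (and tests) can capture the message.
  def initialize(warner: ->(message) { warn(message) })
    @tag_list = []
    @warner = warner
  end

  # Still assigns the tags, but emits a warning on every call.
  def tag_list=(values)
    @warner.call('DEPRECATION WARNING: tag_list= is deprecated (hypothetical message)')
    @tag_list = Array(values).map(&:to_s)
  end
end

build = DeprecatedSetterSketch.new
build.tag_list = ['new_tag'] # writes the warning to stderr
```

Keeping the assignment working while warning is what lets existing callers be found and fixed before the setter is removed.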

3. Update eager loading:

  • Modified eager_load_tags scope to include :job_definition association
  • Ensures efficient queries when accessing tags through job definitions
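Why preloading the association matters can be shown with a small in-memory simulation. It does not use ActiveRecord or the real eager_load_tags scope: DefinitionStoreSketch, BuildRowSketch, and preload_definitions are hypothetical names, and the store simply counts lookups the way a query counter would.

```ruby
# Hypothetical in-memory store that counts lookups, standing in for the database.
class DefinitionStoreSketch
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @query_count = 0
  end

  # One lookup per call: the N+1 pattern when used inside a loop.
  def find(id)
    @query_count += 1
    @rows[id]
  end

  # One lookup for a whole batch of ids: what preloading achieves.
  def find_all(ids)
    @query_count += 1
    @rows.slice(*ids)
  end
end

BuildRowSketch = Struct.new(:id, :definition_id, :definition)

# Loads every needed definition in a single batch and attaches it to each build.
def preload_definitions(builds, store)
  by_id = store.find_all(builds.map(&:definition_id).uniq)
  builds.each { |build| build.definition = by_id[build.definition_id] }
end

store = DefinitionStoreSketch.new(
  1 => { tag_list: %w[docker] },
  2 => { tag_list: %w[linux] }
)
builds = [BuildRowSketch.new(10, 1), BuildRowSketch.new(11, 2), BuildRowSketch.new(12, 1)]

preload_definitions(builds, store)
builds.each { |build| puts build.definition[:tag_list].inspect }
puts store.query_count
```

Calling store.find once per build instead would increase query_count with every build, which is the pattern the updated scope is meant to avoid.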

4. Comprehensive test coverage:

  • Tests for reading tags from job definition
  • Tests for fallback to legacy tags when job definition is absent
  • Tests for empty tag lists in job definition
  • Tests for eager loading behavior

Migration Path

This MR is part of a multi-phase migration:

  1. Phase 1: Add tag_list column to job definitions table
  2. Phase 2: Start storing tags in job definitions for new jobs
  3. 🔄 Phase 3 (this MR): Switch application to read from job definitions
  4. 📋 Phase 4: Upsert tag_list into tags table
  5. 📋 Phase 5: Backfill existing jobs with tags from legacy tables
  6. 📋 Phase 6: Drop legacy ci_build_tags table

Benefits

  1. Single source of truth: Tags are read from job definitions whenever one is available
  2. Sharding ready: Job definitions are designed for horizontal sharding
  3. Performance: Reduces joins to legacy tables once migration is complete
  4. Data consistency: Eliminates potential discrepancies between legacy tables and job definitions

References

Closes #569119

Related to:

Screenshots or screen recordings

N/A - Backend code change

How to set up and validate locally

  1. Run the updated tests:

    bundle exec rspec spec/models/ci/build_spec.rb -e "tag_list"

  2. Test eager loading:

    # Verify no N+1 queries when accessing tags
    builds = Ci::Build.eager_load_tags.limit(10)
    
    # This should trigger a single query on p_ci_job_definition_instances/p_ci_job_definitions
    builds.each { |b| puts b.tag_list }

  3. Check deprecation warnings:

    # This should trigger a deprecation warning
    build = Ci::Build.first
    build.tag_list = ['new_tag']  # DEPRECATION WARNING: tag_list= is deprecated...

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

  • Performance: Eager loading updated to include job_definition
  • Reliability: Maintains backward compatibility with fallback to legacy tags
  • Security: No security implications
  • Maintainability: Simplifies data model by establishing job definitions as SSoT; deprecation warnings guide future development
  • Testing: Comprehensive test coverage for all scenarios including edge cases
  • Database: Part of planned migration to drop legacy tables; no immediate database changes
Edited by Pedro Pombeiro
