GitLab.org / GitLab · Issue #215646 · Closed

Issue created Apr 24, 2020 by Kamil Trzciński (@ayufan), Maintainer

Implement worker that `hard-deletes` old CI jobs metadata

We recently implemented Deprecate retention of build metadata older than 3 months (gitlab-foss#50939 (closed)).

After a specified period, this allows us to disable the ability to retry jobs. Such jobs are marked as archived and cannot be triggered individually. This does not prevent re-triggering the whole pipeline for a given commit, but it does mean the pipeline's metadata becomes eligible for removal:

  • metadata of individual builds
  • DAG dependencies
  • etc.

The above is based on the assumption that after some period you do not really care about being able to retry an individual job. You can still trigger the whole pipeline. Holding all of this data is storage intensive, as the database is not infinitely scalable.

This was the first step: we implemented a soft-delete (we disallow retries, but do not remove the actual data).

The next step would be to start hard-deleting this data once it passes the deadline. This would allow us to remove a substantial amount of data from the database and make the whole system significantly healthier.

I would assume that we would introduce a worker that walks the table and deletes expired entries. This worker would likely be executed once a month, or even less frequently.
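Such a worker could be sketched roughly as below. This is a hypothetical illustration in plain Ruby, not actual GitLab code: the class name, the in-memory metadata store, the retention period, and the batch size are all assumptions. The relevant part is the loop-until-empty batching pattern, which keeps each deletion pass short on a very large table.

```ruby
require 'time'

# Hypothetical sketch of a scheduled worker that hard-deletes metadata
# of CI jobs archived past a deadline, working in small batches so no
# single pass holds a long transaction. Names are illustrative only.
class CiJobsMetadataCleanupWorker
  BATCH_SIZE = 1_000

  # `store` stands in for the jobs metadata table: any collection of
  # records with an :archived_at timestamp.
  def initialize(store, retention: 90 * 24 * 60 * 60)
    @store = store
    @retention = retention
  end

  def perform(now = Time.now)
    deadline = now - @retention

    loop do
      # Collect one batch of records archived before the deadline.
      batch = @store.select { |r| r[:archived_at] < deadline }
                    .first(BATCH_SIZE)
      break if batch.empty?

      # Hard-delete the batch, then loop for the next one.
      batch.each { |r| @store.delete(r) }
    end
  end
end
```

In a real implementation this would presumably be a Sidekiq cron worker issuing batched `DELETE` statements against the metadata table, but the structure (compute deadline, delete in bounded batches, stop when a batch comes back empty) would be the same.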

Edited Dec 03, 2020 by Alberto Ramos