Respondents primarily access job logs to debug failed jobs, with 83% indicating this reason. 54% of respondents require access to job logs for up to 30 days.
The most common challenge faced in managing artifacts is storage. Storage is affected by the number of artifacts generated from pipeline runs, not the individual artifact file sizes.
Users expressed a need for improved job log interfaces, including timestamps and auto-collapsing sections for better usability.
Suggestions
A mechanism to bulk delete job logs after a designated retention period would significantly free up storage space. Here are some ideas:
For job logs generated in the future - Automated process
Provide an option for users to opt into automated removal, running cleanup at specified intervals.
Add job logs to the existing expire_in keyword to set a designated retention period.
For job logs generated in the past - Bulk deletion and archive
Provide a "delete all" button for job logs from a specific day forward.
Enable bulk deletion through the API (a rough sketch follows this list).
Develop a better search and multi-select mechanism in the UI for easier management of job logs.
Implement auto-collapsing for script sections in the job log to help users navigate more easily when there is a lot of output.
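As a rough illustration of the API-based bulk deletion idea above, a script along the following lines could erase jobs older than a chosen retention period through the existing jobs and erase-a-job REST endpoints. This is only a sketch: the host, project ID, token, and 30-day cutoff are placeholders, it assumes GNU date, curl, and jq are available, and note that the erase endpoint removes a job's artifacts together with its log.

```shell
#!/usr/bin/env bash
# Sketch only: bulk-erase jobs older than a retention period via the existing REST API.
# Placeholders: GL_HOST, GL_PROJECT_ID, GL_TOKEN. Assumes GNU date, curl, and jq.
set -euo pipefail

GL_HOST="https://gitlab.example.com"
GL_PROJECT_ID=12345678
GL_TOKEN="<your_access_token>"
CUTOFF=$(date -u -d "30 days ago" +%Y-%m-%dT%H:%M:%SZ)   # designated retention period

page=1
while :; do
  jobs=$(curl --silent --header "PRIVATE-TOKEN: $GL_TOKEN" \
    "$GL_HOST/api/v4/projects/$GL_PROJECT_ID/jobs?per_page=100&page=$page")
  if [ "$(echo "$jobs" | jq 'length')" -eq 0 ]; then
    break
  fi

  # Erase each job created before the cutoff (this removes its log and any artifacts).
  echo "$jobs" | jq -r --arg cutoff "$CUTOFF" '.[] | select(.created_at < $cutoff) | .id' \
    | while read -r job_id; do
        curl --silent --output /dev/null --request POST \
          --header "PRIVATE-TOKEN: $GL_TOKEN" \
          "$GL_HOST/api/v4/projects/$GL_PROJECT_ID/jobs/$job_id/erase"
        echo "Erased job $job_id"
      done

  page=$((page + 1))
done
```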
Detailed Findings and Respondents' Background
Primary Workflows for Job Logs:
The primary reason for accessing job logs is to debug when jobs fail (10 responses).
Other reasons include seeing test results, checking build speeds, ensuring reproducibility, auditing specific tests, and assessing project health (1 response each).
Duration of Job Log Access:
54% of respondents (7 people) need access to job logs for 30 days.
Other durations include (1 response each):
Custom retention policy
One week
Indefinite for tagged releases, 30 days for per-commit builds
3 months
6 months
1 respondent was unsure about the required access duration.
Challenges in Managing Artifact Storage:
Storage is affected by many teams running numerous pipelines.
Challenges are more related to Docker images than to artifacts/logs.
Issues with cleanup policies from runners and specific retention policies were highlighted.
Additional Needs and Improvements:
Respondents look forward to having timestamps on job logs as a default feature.
The feature flag FF_SCRIPT_SECTIONS has significantly improved the user experience and is suggested to become default.
One respondent suggests an interface with auto-collapsing sections (e.g. script sections from the bottom) for better usability.
Compliance Requirements:
54% (7 respondents) work under internal requirements.
38% (5 respondents) need to comply with GDPR.
15% (2 respondents each) are not working with any of the listed requirements, are unsure, or comply with PCI-DSS or FDA standards.
Background:
This research is part of https://gitlab.com/gitlab-org/ux-research/-/issues/2971+, conducted in May 2024. It targeted Dev team leads, platform engineers and software developers from a mix of SMB and enterprise-size customers. The goal was to determine specific use cases for restricting download access to artifacts and to understand how job logs are used and how long they need to be retained.
The survey ran for a month, yielding 12 valid responses. Of these, 53% were from technology companies involved in creating pipelines, writing code, and running tests. Additionally, 54% of respondents use Self-Managed GitLab, while 23% use GitLab.com.
Job logs do not currently have a retention policy (no expiration) and are difficult to manage
As part of &8715 we need to add functionality that gives customers visibility and user-friendly options to view and delete these job artifacts and traces. We currently have one API allowing for deletion: https://docs.gitlab.com/ee/api/jobs.html#erase-a-job.
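For reference, erasing a single job with that documented endpoint (which removes its log and artifacts) looks roughly like this; the host, project ID, job ID, and token are placeholders:

```shell
# Erase the log and artifacts of a single job via the documented erase-a-job endpoint.
# Replace the host, project ID, job ID, and token placeholders with real values.
curl --request POST \
     --header "PRIVATE-TOKEN: <your_access_token>" \
     "https://gitlab.example.com/api/v4/projects/<project_id>/jobs/<job_id>/erase"
```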
There are multiple potential solutions. This issue addresses the first item only.
Introduce a retention policy which expires job logs after a certain period (e.g. 7 days) for auto deletion.
@jocelynjane I agree that we should introduce a retention policy and I think that's preferred over API or UI. This makes it easy for users to manage their job logs without having to work through API/UI to remove them.
Having said that, we need to be cognizant of the impact of adding a new retention policy on existing job logs. Some users may have taken for granted that job logs will remain forever and be caught by surprise when a retention and removal policy is introduced. So we might need to allow ample time between announcing that we will be introducing a retention policy and when expiration begins, so that users who need any information from job logs can do what is necessary to retrieve what they need.
There may also be other reasons why we have always kept job logs around without deleting them. I don't know what they are off the top of my head, because it has been that way since I joined the team. We need to validate if these reasons still apply. @fabiopitino do you know what other reasons could there be? If there is no strong reason to keep them forever, I think it's fair to start clearing them.
I found a doc that states it is safe to remove but will result in empty UI. It makes sense to me, so I'm not sure why we don't erase them automatically after a set retention period.
There are existing ways to remove job logs, but they are not ideal for regular maintenance of projects, especially on GitLab.com:
It only gave users 3 days to adapt, which seems very short. The blog post was released on June 18, 2020, and the changes were to take effect on June 22, 2020. We should also consider other communication media besides blog posts; I'm not sure how many users actually read blog posts religiously every month.
This default job expiration applies to all artifacts EXCEPT job logs. Also note that while the change took effect 3 days later, we set expiration dates to 1 year after the creation date, so we didn't start deleting artifacts immediately.
Do you have plans to do so? This will help, and we will not need to clean up ourselves via the API.
Introducing an expiration policy for job traces is a good idea, but I think 7 days is not a reasonable default value.
Most other CI services retain job logs for 90/180/360 days (e.g. GitHub Actions = 90 days). Anything you choose should be on the same scale, because users typically read documentation only when something unexpected happens. And "something unexpected" in this case would be an irrecoverable data loss.
I would agree a default expiration policy would be OK, and I agree with @sio that 7 days is way too short. I guess the best would be a default with an option to change it for either a project or a namespace (I would prefer the latter).
But don't you think it's kind of strange to introduce a default expiration policy for job traces when there is no such thing for job artifacts (on gitlab.com)?
Would a non-customizable 90 day retention policy for job traces be enough?
IMHO this should be configurable at the group level on GitLab.com SaaS, similar request for expire_in for job artifacts in #370552
There are various reasons why users may want to keep job logs for a longer period of time. I'll try to impersonate a few ideas:
Compliance and reporting in my company requires the logs to be kept and archived.
A 90-day default does not work when logs need to be kept for over a year to be able to debug/analyse older builds
Keeping logs forever does not hurt because they are only a few kilobytes of data
I'd suggest adding a config option that allows users to opt-out.
Is there a significant need for a separate project/namespace setting for job trace retention length?
Owners/maintainers of a namespace will see the log storage warning - my thinking is that they will want to take action. For that target persona, a namespace setting should be sufficient for housekeeping and for preventing the storage from growing over time.
Would it also be enough to just follow whatever expiration the job artifacts have?
That's 7 days on GitLab.com SaaS according to @joshlambert 's comment in #370552 (comment 1101372068), which can be too low if you are coming back to investigate a job log. It potentially makes sense not to follow 7 days, because the amount of data generated in artifacts is estimated to be higher than for a job log.
Priority: (Customer did not get back to me on this)
Why interested: We must retain all build logs for 6 months to meet FedRAMP High certification standards
Problem they are trying to solve: Allow our company to take on contracts that require FedRAMP High
Current solution for this problem: No current solution, keeping logs forever is probably ok though auto removal after 6 months would be ideal for us.
(I have asked the customer to elaborate on how having the auto removal after 6 months helps with their workflows, but they did not respond back to me.)
Impact to the customer of not having this: As mentioned above, there is the possibility that this (logs being able to be deleted before 6 months) may not allow us to meet new compliance rules, which either means losing contracts or moving off of GitLab CI
Thanks @kenneth - this is helpful information to have. We need more of this type of feedback to understand what an automated policy might look like in terms of expiry duration.
Is there any way that job log retention behavior can be changed so that it relies on the same artifacts:expire_in setting as artifacts do, or is this inadvisable?
I ask because I have a GitLab Premium customer report of a repository using over 100GB of storage, almost all of which is artifacts. While researching, I discovered on the Artifacts page many "artifacts" that are older than their artifacts:expire_in setting (1 day). When I expand the artifacts in the list, they are all job logs.
As in the reports above, the customer is concerned about the upcoming namespace storage limitations, and is also confused as to why the artifacts appear to remain beyond expiration.
Please let me know if I can provide any more information. Thank you!
@mgibsongl we currently do not have a way for customers to expire job logs as part of the artifacts:expire_in setting. Job logs are not considered the same type as our other build artifacts. I am curious if the customer would be willing to answer a few questions on how they use job logs and what their housekeeping needs are around these specific artifacts. My calendar is open to chat or I can send over questions for async discussion.
Another reason build artifacts might not be expiring would be tied to a known bug we have. Some artifacts, when generated with "keep latest artifacts", are not being properly unlocked. If you're seeing any of those items still lingering past their expiration period, it is likely due to this bug.
Thanks Jocelyn. The user seems to be unconcerned in general with job logs specifically; rather, they were surprised by the following:
The jobs the logs belonged to were deleted.
The job logs showed as artifacts.
The artifacts (which were actually job logs) weren't being removed as they expected based on the artifacts:expire_in setting.
Based on the conversation I had with the customer, those are my takeaways. It's my understanding that they do not require nor desire that job logs for deleted jobs remain. I hope that helps clarify it a bit more.
We've been using https://gitlab.com/gitlab-de/use-cases/gitlab-api/gitlab-storage-analyzer (and community scripts) to purge artifacts archive from a large project on gitlab.com that has an intensive pipelines usage.
Yet, the storage for trace logs and metadata remains high considering the volume of jobs, and we are struggling to reduce the storage any further.
Is there currently an alternative available to purge them (instead of manually deleting them through the UI, which is impossible to consider)?
@jocelynjane - can you comment on when this is being addressed? Having a mechanism to address job logs is one of the key requirements to enforce storage and is noted in: #375296
@joshlambert I think we had a miscommunication on this one. This was not part of the work I had planned on, as 1) we still don't have data to tell us what the policy should be and 2) we provided the UI to assist in this process.
If we need to do something separate, @shampton and team will need to deep dive a little more into the details and provide an estimate. I don't see this being trivial.
@alberts-gitlab Can you take a look at this sometime in the next week to give a surface level proposal and weight for this? Please timebox an hour for your investigation.
To verify, are the job.log files counted against total consumed storage for SaaS customers? If so, even though these are relatively small files, they do add up, especially for projects with large amounts of CI/CD activity. Not being able to easily clean these up, especially through the artifacts API, seems like a storage penalty (if it counts towards limits). There is the option to perform a "select all" on the Builds->Artifacts page, which is nice, thank you for this, but unfortunately the 20-items-per-page limit continues to be a bottleneck for projects with hundreds of pages of these files (or more).
If I am incorrect, and the API should work for the job.log files, please let me know.
Create a universal policy for job logs (i.e. all job logs will expire in 6 months as a first iteration; a custom setting would be a follow on based off demand) OR
Apply the same expire_in settings to job logs.
The second option would be more ideal versus just expiring all job logs at a specified time (and creating a separate setting). Is it possible to get a rough idea of what the implementation would take? Thanks!
@jocelynjane I think if we are able to add expire_in to job logs specifically that would work. The user may want to have the job logs take longer to expire than the other artifacts, so as long as we can account for that use case we should be okay.
I agree that a separate expiration setting for logs might be needed, maybe something like job_name:expire_logs_in. Then default to 6 months, and enforce the max to 6 months?
In terms of complexity, this also depends on what we want to do with existing logs that have no expiration set. Do we migrate them all to have an expiration of 6 months from now?
Do we migrate them all to have an expiration of 6 months from now?
@iamricecake @jocelynjane Since users have no way of setting the expiration of past logs to be longer than six months (e.g. "never"), I think we should default the existing logs to expire_logs_in: never. Users can go back and delete past logs if they want to clean up, but I don't think we should force their logs to expire if they don't want to.
@shampton Maybe an alternate idea to "force" default cleanup: if newly created pipelines after a specified date (e.g. the 16.5 release date) default to 6 months, and old pipelines keep the never behavior, no data will be harmed. Users can override the settings from the UI or config respectively, and this value overrides any other defaults.
@shampton @jocelynjane do we actually want to support never expiring job logs? I thought this is what the retention policy is trying to address. IMO, if we will support non-expiring job logs, we will have to implement account-type limits for this, like maybe only supporting this for Ultimate, to prevent abuse.
@iamricecake you are correct - we do not want to support never expiring job logs. I think our customers need to potentially set up their own expiration time period. We are implementing this issue because we need to have automated housekeeping.
@shampton I think we want to go and help remove the old job logs. We can have a "from this day forward" policy based on our release of this feature and provide a 6 month (? - or less) timeline where we let customers know either they need to take an action to save the older logs or we will do a batch deletion.
@jocelynjane - we may need to confirm if there are auditing and compliance requirements related to job logs that may require extensive retention of logs
From a customer perspective: I'd like to keep job logs for releases/tags forever, as this seems valuable from a compliance point of view. Probably the same for all default branch jobs
FYI, part of the new storage management automation docs also includes programmatic snippets for deleting job logs with the GitLab CLI, and python-gitlab API library scripts to delete job logs. A complete example matching age or size is also available.
@pmurray7 would your customer be interested in chatting with me for 15-30 minutes on their artifact usage and how they perform artifact housecleaning for research purposes? I'd like to understand what types of solution would work best for our customers, as we have a number of options (e.g. separate job log settings vs. using expire_in).
We are a GitLab Premium customer with 28 seats, and have a good number of projects with automated pipelines.
I can't recall us ever needing to refer to a job log older than a week. Having the ability to set an automatic job log expiry tailored to our expectations, just like other artefacts, would be perfect.
Just to give a better idea of the current end user experience from our perspective.
We have one project with 6.4 GiB of job logs, probably going back 2+ years at this point. As the UI allows me to select and delete only 20 items at a time, you can imagine how long it might take for thousands of logs. Needing to fall back on using the API directly is not a solution; it just reinforces the usability problem of this feature.
The end result is that we simply "let it be". Even a slight improvement to pagination in that UI would help (items per page, quicker page browsing).
Always interesting to see how diverse the requirements are.
In our case (Premium with currently 23 seats), we sometimes need to refer to job logs that are more than 1 year old. In our old engineering system, going back as far as 3 years happened frequently, and I don't see any reason why it would be different with GitLab now.
The scenario is that we're running a lot of end-to-end tests, and these are somewhat fragile by their very nature as we have a lot of dependencies on third-party services. We want to be resilient, but we do not have the bandwidth to investigate every single broken test run. So when something breaks we mostly just re-run the test job so we are unblocked and can move on. For those errors that occur frequently, we fix them asap. For those we have already seen but happen rarely, we grep through the e2e job logs to get an idea how often and when it already happened, then tackle them step by step.
In the old system, we accumulated 173 GiB over 7 years, which compresses down to 70.8 GiB on disk. Not an issue at all, because storage is cheap.
So if or when a retention policy is introduced, I think it makes sense to default to a relatively low expiration duration like 1 week or 1 month. However, it would be great if we could set an infinite or a pretty high duration for those other cases where we actually want to keep the logs around.
Why interested: Wanting to cut down on storage for older, obsolete pipelines.
Current solution for this problem: Manual solution to delete pipelines via API. Artifact retention can take care of this partly, but job logs still require manual intervention.
@manuelgrabowski @ovider can either of you please provide additional information on what the customer need is here, specifically - are they looking for job logs to have the same expire_in as their other job artifacts? Do job logs need a different expiration policy (e.g. perhaps most of the artifacts can be deleted after 3 months, but job logs need to be retained for 6 months), or do they need something entirely different? Thanks!
Hi @jocelynjane, in this case the customer was looking for a way to delete entire pipelines after a given time, to reduce storage usage from artifacts and job logs. Now that you mention it, automatically delete old pipelines (#338480 - closed) might have been a better fit for their ask. While deleting pipelines, and thus both artifacts and job logs without having to consider expire_in beforehand, is possible manually and scriptable via the API, they were looking to avoid having to implement something via the API.
@manuelgrabowski I am behind developing a solution for job logs independent of deleting the pipeline. If there is an interim solution here through this issue, I'm happy to discuss.
Self-hosted instance admin here, on GitLab-CE. And +1.
We're currently onboarding Renovate to get dependency updates automated (rather than doing it once/twice/whenever per year and basically missing important security updates just because nobody's looking/paying attention) and set up a dedicated project/pipeline as they suggested for Renovate Runner. Since we run this once every hour (with the default limit of 2 MRs max per run), we've accumulated over a thousand jobs in barely a month, each retaining its run log (which grows longer the more repositories we add to the cycle).
Those old logs don't serve any purpose once the next one (or maybe the next 3-5) have come around, so the best case for those would be to prune them after X runs (only keeping the last X runs/pipelines/jobs alive by default) or expiring them after X days so they're automatically deleted.
Currently, those are the only pipelines we create through GitLab Runner, but we plan on adding more for various other tasks; they have similarly diminishing usefulness after a certain point (mostly when the associated branch was deleted or the associated MR was merged, but also once X newer pipelines have come around and they've basically become obsolete).
With that, we also get a certain kind of pipeline where we want/need specific control, particularly the option to never reclaim it: releases (as well as patches/hotfixes etc.), where we want to keep the logs for audit/compliance purposes.
Fortunately, this isn't a problem yet for us in regards to size, resource usage etc.; but we can see this getting out of hand and requiring further customization on our end to keep things manageable (as many other comments already mentioned, by using the API/glab/etc. to delete them regularly).
And while this is certainly doable, the number of different people posting their own variants of such a script, as well as the capabilities of similar products (such as AzDO), makes me wonder how this isn't a thing in GitLab already.
To sum it up, I see the following things one might wanna do:
Decide whether Pipelines/Jobs and their associated Logs should expire or not (for auditing/compliance reasons, some projects might not want to expire them ever.)
Decide when to expire them (based on time/age, number of more recent ones for the same ref, or tie-in to a ref/branch/MR.)
Define this on instance level (globally, as a default; might as well be gitlab.rb) then override on Project (or maybe Group) level, down to Branch level (which likely goes into a .gitlab-ci.yml directive.)
Override persistence on the Pipeline/Job itself and mark them as persistent (for one-offs, or when branch-rules/etc. aren't feasible to catch them all.)
Note that most if not all of those are inspired by Azure DevOps Services, which we currently use for most of our CI needs (mainly because we had it set up before we switched to GitLab; and many of the pipelines we have over there do not really carry over to GitLab Runner yet.)
Overall, it's about storage quotas.
A job without artifacts might still make sense, but a job without a log?
We are in a similar situation (self-hosted instance) and focus solely on retention policies for whole pipelines (via API).
With such a policy, a separate policy for job logs would certainly be less important for other customers.
Since I stumbled into the quota section while looking for something else, that's what our Renovate runner repository looks like at the moment. It does nothing else than host the config file for Renovate and the pipeline YAML file.
4 KiB repository vs. 150 MiB job artifacts (which is just the job itself plus its logs) seems excessive, and anything older than, let's say, a week (honestly, anything older than a day) has no real value for us, since the only reason we'd look in there is to investigate issues while running the pipelines themselves (which are bound to fail then, so we will look as soon as practical).
And that already includes the following:
default:
  artifacts:
    expire_in: 3 days
...so there's not a whole lot else that I could put into the file that would help. I don't even think this does anything, since there are no actual artifact files, just the console log (stdout plus stderr, if any).
Just to follow up on my own screenshot: I've since run the script that loads a month's worth of builds through the Rails console and then erases the build. The repository is now down to low single-digit megabytes (because I didn't delete all the logs yet), but this is painful to do on a regular basis. Something that I can schedule (perhaps through a CI script) would be largely preferable in the interim (to echo Fabio's comment further down: bulk delete/expire first, retention strategy long term).
We cannot remove Artifacts from S3 without causing geo replication issues without this feature. S3 costs will continue to increase.
Problem they are trying to solve:
Removing objects from S3 storage causes Geo replication to fail as there are still existing artifact references for artifacts which are removed from S3
@pmurray7 is the customer willing to chat about more specifics on their artifact management processes, and how they use job logs to help with the design here? We have a number of potential options for implementation and are looking for customer feedback. Thanks!
Implementing a mechanism to delete job logs after a designated retention period would significantly free up storage space for the customer's needs, specifically in regard to the budget for storage costs.
@jocelynjane - can you share the latest status for this issue regarding priority? If you need any more context from the customer, please let us know.
@manuel.kraft we have a number of potential solutions, but we don't have enough data yet to select and implement one. We have also been focusing on resolving bigger Build Artifact issues with improper unlocking of "keep latest". I would like to understand 1) how the customer is using job logs and 2) how the customer currently manages artifact storage in general (e.g. what is their current artifacts policy)?
@bonnie-tsang - let's prioritize this with your other Build Artifacts validation work. This will make a big impact for our users (as it relates to storage cost).
Customer would like to configure the time on how long the job logs should be kept. It should be done similar to other retention configurations for storing GitLab data.
The current retention policy for other data is normally set to 30 days.
Job logs are consuming a lot of storage and customer wants to reduce storage costs.
@jocelynjane - Can you give an estimate for this work? As you mentioned, it is related to costs, which is a priority 1 topic nowadays.
Thanks for the follow up @manuel.kraft! I understand this is a big pain point for our customers, and important as we look to enforce storage limits.
@bonnie-tsang is looking at the data we have collected to determine our next steps in the design process (solution validation) in %17.0. Once the best fit solution is identified, we can assess the effort for implementation. We will do what we can to get this into our plans!
Thanks @jocelynjane - I think from a UX perspective, we can see how other parts of GitLab implemented such policies for the corresponding data/storage and try to make it consistent with those; if it's a reusable UI component, it will make it even easier. Of course, from a backend perspective it may be totally different, but I guess the groundwork is available from the other parts of our product.
@manuel.kraft we have definitely considered various options here (whether it is a separate policy, or we add job logs to the existing expire_in, and what the UI may look like) - it ultimately depends on the best fit, as we want to be careful of introducing too many settings/complexities for artifact management. The implementation is reusable!
@manuel.kraft @ddornseiff can you find out if this really comes in the near future, or whether we need to build another workaround to deal with the deletion of job logs...
Hey, we are premium subscribers and would like to see this since you will be imposing quotas soon and some of our larger projects have gigabytes of job logs.
We are also finding this to be a problem. We are approaching storage limits, and after clearing all old artifacts we have GBs of job logs left. It would be good to have a simple way to manage this.
A lot of manual work was required for this customer to be able to understand why artifact cleanup efforts were not impacting the overall storage. By investigating objects via the rails console we were able to understand that a lot of the artifacts were in fact large job log traces that have to be erased individually via the API.
Since traces do impact storage consumption, we should definitely have a place where a user is able to monitor large traces and manage their space.
I'm asking because I've been working on this and other job-log-related issues as part of Category:Build Artifacts, and I'm happy to reassign any and all of those issues.
Why interested: We identified that we have significant storage usage in some projects, which turns out to be just logs. Our database has almost 13M artifact entries which are entries about logs and not "real" artifacts.
Problem they are trying to solve: High storage usage. This impacts Geo replication of artifacts as well.
Current solution for this problem: None, as the documented API and instance-wide workarounds erase the entire job, including job artifacts. Customer noted that "Artifacts retention and logs retention should be separate. I can think of cases where artifacts are needed just for few days and logs for few months, and vice versa."
Impact to the customer of not having this: Artifact storage costs will continue to increase, Geo replication of artifacts continues to be impacted.
@smathur we're currently gathering input on this subject to build a retention policy/plan. Would the customer be interested in having a 15 min chat to talk about how they use their job logs and the workflow for storage management to help shape our solution? Thanks!
@jocelynjane I invited you to the call with the customer on Monday at 7 am PT. He is in Greece so I have to accommodate the earlier time. Let me know if you prefer a different day.
@jocelynjane I just had a call with a customer that is also experiencing issues with legacy job logs taking up a lot of space and being difficult to remove. Are you still looking for customer feedback?
@bonnie-tsang to help the transition of this issue, can you please summarize the findings from the interviews and survey for handoff? group::pipeline execution will handle solution validation (if required) and implementation.
If there is a clear recommendation from the findings, please note that as well. Thanks!
Usually goes back at most a few days, maybe a week or two. (link)
2 - Identify patterns of failure history
There are scenarios when we attempt to revisit much earlier jobs while identifying patterns of failure history.
Identifying patterns of job timeouts in the last month. This would need the job log to exist for at least a month for the best investigation outcomes. (link)
Situations where users needed to refer to logs from a year ago for debugging and audit purposes. If a retention policy is implemented, we would need to retain the deployment and release logs for at least one year. (link)
3 - Performance monitoring
Don't need to store these logs for a long time. (link)
We received 12 valid responses. Most respondents were from technology companies involved in creating pipelines, writing code, and running tests, primarily using Self-Managed GitLab.
Please see the results summary and suggestions in the issue description. Detailed findings and the background of the research can also be found in the details block.
Manually cleaning via Ruby, which is slow, requires anyone doing it to study the documentation, and is prone to human error. However, I cannot imagine I would clear job logs and artifacts instance-wide, or on a per-project basis on more than 2 projects.
@irisb the work in %17.2 is for UX research, not the implementation (per the workflow label). This category has now transitioned to group::pipeline execution, and @rutshah is the right PM to discuss timelines.
Do you think it makes sense to combine this issue with some of the CI data retention discussions/issues to date?
@cheryl.li While this may fall under the data retention strategy, it may take a while before we get to implement it as retention strategy across all CI data (e.g. dependency on partitioning, etc.).
I believe that a bulk deletion (actually, expiration) of artifacts can get us a long way and users could schedule that periodically as they wish. This may be a much smaller work than the retention strategy which requires a lot more considerations.
Trace deletion could be more complex because we also have to erase possible trace chunks.
Once we can bulk expire and delete trace artifacts (job logs), we have the underlying support for automatically expiring artifacts.
Separate issue: we can introduce an instance setting to set a default job log expiration date. We can have a strategy to introduce this on GitLab.com and also to apply it retroactively (e.g. 1 year in the future).
Retention policy (this issue) to align job logs to the pipeline archival.
Why interested: We have projects storing upwards of 320GB / 1,000,000 job logs (S3 object store) -- this affects backup and recovery timing and retention costs. Probably bloats the DB too.
Problem I am trying to solve: Facilitate customer led (project maintainer) cleanse / or automated expiry.
Current solution for this problem:
Through the Rails console (admin only), remove data month by month (this takes hours for heavy projects; smaller projects can use larger time filters or remove everything at once)
# 1. Set user to appear as the author of deletion
admin_user = User.find_by(username: '<userid>')

# 4. Remove each of the build logs (This takes time)
builds.each_batch do |batch|
  batch.each do |build|
    print "Ci::Build ID #{build.id}... "
    if build.erasable?
      Ci::BuildEraseService.new(build, admin_user).execute
      puts "Erased"
    else
      puts "Skipped (Nothing to erase or not erasable)"
    end
  end
end

# 5. Repeat 3-4 with a new time range until removed.
The problem with this approach is that it took roughly 48 hours to complete (~1 hour to retrieve a month's worth of builds, deletion of those builds at 5 requests per second).
Increasing the search range (through project.builds) beyond a month for projects of this size takes hours (days?) and increases the risk of failure.
I can see there's an experimental GraphQL bulkDestroyJobArtifacts mutation used by the project artifacts screen -- this is one interface available to customers, but it requires examples to programmatically walk through job logs and remove them
Impact to the customer of not having this: Potential storage quota issues, increased duration towards backup and recovery
We had another customer report some difficulties with managing artifacts for old projects. There is still a lot of confusion. For example, we calculate Repository Size & LFS, but a job log is considered an artifact and isn't counted towards repository limits. If you have tens of thousands or hundreds of thousands of jobs, it can be difficult to tell at first glance what type of artifacts are using up space. You could have a successful expire_in configured and still have no idea why artifact usage is still high, because it can be due to the sheer number of jobs. Having a retention policy would make this easier and not require scripting, which costs customers time.
In the case of this internal issue we suggested the use of the support glab tool:
## Requires installation of glab and jq

# Set project ID variable
GL_PROJECT_ID=12345678

# Grab all job IDs from the project and put them into a list.
# It uses the paginate flag to ensure we continue to get all results
# and uses JQ to only grab jobs older than the 1st of this year
glab api --paginate --method GET projects/$GL_PROJECT_ID/jobs \
  | jq --compact-output '.[] | select(.created_at < "2024-01-01") | .id' \
  > ${GL_PROJECT_ID}_jobs_list.txt

# Take input from the list and call the erase method on each of them
# This erases all job logs older than the set date and then displays the name of the job
# You can add a sleep if necessary due to rate limits
while IFS= read -r job_id; do
  glab api --method POST "projects/$GL_PROJECT_ID/jobs/$job_id/erase" \
    | jq --compact-output '.name'
done < ${GL_PROJECT_ID}_jobs_list.txt
Why interested: They have a repository that regularly runs a lot of tests and writes logs. In the case of green tests, the logs do not need to be kept; in the case of failed tests, the developers want to look at the log and do their debugging. There is currently a case where a repo - although only 4 months old - already has 10 GB of logs.
Impact to the customer of not having this: Potential storage issues.
@rutshah is there already a timeline for this feature?
hello @cheryl.li - Discussed this with another large enterprise here in DACH (2000 users, Premium, Self Managed) again this week, and they are also heavily demanding the implementation of this functionality to get control over the massive amount of storage required for job logs.
Can you already provide a potential timeline for this? thank you!
Thanks for the ping @manuel.kraft! This is on our radar, but given capacity constraints on the team and current assignments, we're likely not going to review this until FY25-Q4 at the earliest. I believe there are plans to review our data retention policies as a whole across the organization that @mjwood will be driving.
With that said, I wonder if there are quick wins we can be building for our customers, e.g. if they want to remove select types of job log data themselves, and need not wait for a global retention policy to be in place.
Thank you @cheryl.li for the quick update/context and your offer to potentially provide an interim solution which customers may use to remove select types of job logs themselves.
I have already reached out to the customer's team this morning to get their feedback on this approach/proposal. Currently waiting on a reply.
@alex-dess - in case this gets traction, please coordinate with the client team while I am off for the next few days.
Hello @cheryl.li - I talked with the customer team today and they would be happy to get help from your team on an interim solution. Can you share how to proceed from here and what they should test?
A valid workaround might be a scheduled pipeline that deletes old job logs. The gitlab-storage-analyzer tool, mentioned above by @dnsmichi, could potentially be wrapped in such a pipeline. IMHO it would only require a GitLab CI template/component to be used.
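To make the idea concrete, here is a minimal sketch of what the script of such a scheduled cleanup job could run, using glab instead of the analyzer tool. It assumes glab and jq are available in the job image and that glab authenticates via the GITLAB_TOKEN environment variable; CLEANUP_TOKEN and the cutoff date are placeholders, while CI_PROJECT_ID is a predefined CI/CD variable.

```shell
# Sketch of a cleanup step for a scheduled pipeline job (not an official template).
# Assumes glab and jq are installed in the job image.
# CLEANUP_TOKEN is a placeholder CI/CD variable holding a token with API scope;
# glab picks it up via GITLAB_TOKEN, and CI_PROJECT_ID is provided by GitLab CI.
# For self-managed instances, GITLAB_HOST may also need to be set.
export GITLAB_TOKEN="$CLEANUP_TOKEN"
CUTOFF_DATE="2024-01-01"   # erase jobs created before this date

glab api --paginate --method GET "projects/$CI_PROJECT_ID/jobs" \
  | jq -r --arg cutoff "$CUTOFF_DATE" '.[] | select(.created_at < $cutoff) | .id' \
  | while read -r job_id; do
      glab api --method POST "projects/$CI_PROJECT_ID/jobs/$job_id/erase" > /dev/null
      echo "Erased job $job_id"
    done
```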