Improve maintainability of Job Artifacts
Problem
Today, Ci::JobArtifact is a very intimidating model, with a number of complex constants, enums, and arrays. In fact, you can observe:
- 16 file types (e.g. archive, metadata)
- 3 compression types (e.g. zip, gzip)
- So many consts/scopes: *_TYPES, scope :xxx_reports, DEFAULT_FILE_NAMES, INTERNAL_TYPES, REPORT_TYPES, TEST_REPORT_FILE_TYPES ... and more. It's hard to track where we should update. Slightly violating SSOT.
- Each file type is categorized by one of report/internal(?)
- Each file type is categorized by one of feature categories.
- Each file type has a unique default file name.
- One file type is categorized as non-erasable.
- There are many Active Record scopes for each category.
I think we can manage this architecture in a more structured and organized way.
Proposal
- We introduce a YAML file as a central place to organize categories and attributes.
- The first key is one of the file types, which corresponds to Ci::JobArtifact.file_types.
- The second key file_type:description: explains the usage of the artifact.
- The second key file_type:compression: indicates the compression algorithm of the artifact, which corresponds to Ci::JobArtifact.file_formats.
- The second key file_type:format: indicates the file format of the raw/gzip file.
- The second key file_type:category: indicates the category of the artifact.
- The second key file_type:default_file_name: indicates the default file name to be persisted in storage.
- We can add more keys on demand.
e.g.
# ci/artifact_type.yml
archive:
  description: The files produced by the build script.
  compression: zip
  category: general_artifact
metadata:
  description: The metadata of `archive` type artifact.
  compression: gzip
  category: general_artifact
trace:
  description: The stdout/stderr of build script.
  compression: raw
  format: txt
  category: trace
  erasable: false
junit:
  description: JUnit test report
  compression: gzip
  format: xml
  category: test_report
  default_file_name: junit.xml
codequality:
  description: Codequality report
  compression: gzip
  format: json
  category: unknown
  default_file_name: gl-code-quality-report.json
sast:
  description: Static Application Security Testing
  compression: gzip
  format: json
  category: security_report
  default_file_name: gl-sast-report.json
performance:
  description: ...
  compression: raw
  format: json
  category: unknown
  default_file_name: performance.json
license_management:
  description: ...
  compression: gzip
  format: txt
  category: license_scanning
  default_file_name: metrics.txt
dependency_scanning:
  description: ...
  compression: gzip
  category: dependency_scanning
  default_file_name: metrics.txt
lsif:
  description: LSIF data for code navigation
  compression: gzip
  format: json
  category: code_navigation
  default_file_name: lsif.json
dotenv:
  description: The variables generated in user script.
  compression: gzip
  default_file_name: build.env
  category: variables
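
Since the YAML file would become the single source of truth, it may be worth checking its shape when it is loaded. The following is a minimal sketch in plain Ruby, not part of the proposal: the constants REQUIRED_KEYS, OPTIONAL_KEYS, and COMPRESSIONS and the helper validate_artifact_types are hypothetical names, and the rule that description, compression, and category are mandatory is an assumption inferred from the example above.

```ruby
require "yaml"

# Excerpt of the proposed ci/artifact_type.yml, inlined so the sketch runs standalone.
ARTIFACT_TYPES_YAML = <<~YAML
  archive:
    description: The files produced by the build script.
    compression: zip
    category: general_artifact
  trace:
    description: The stdout/stderr of build script.
    compression: raw
    format: txt
    category: trace
    erasable: false
  junit:
    description: JUnit test report
    compression: gzip
    format: xml
    category: test_report
    default_file_name: junit.xml
YAML

# Assumed rules (hypothetical): key names are taken from the example,
# and the compression values are the ones the example uses.
REQUIRED_KEYS = %w[description compression category].freeze
OPTIONAL_KEYS = %w[format default_file_name erasable].freeze
COMPRESSIONS  = %w[raw zip gzip].freeze

# Returns a list of human-readable problems; an empty list means valid.
def validate_artifact_types(definitions)
  definitions.flat_map do |file_type, attrs|
    errors = []
    missing = REQUIRED_KEYS - attrs.keys
    unknown = attrs.keys - REQUIRED_KEYS - OPTIONAL_KEYS
    errors << "#{file_type}: missing #{missing.join(', ')}" if missing.any?
    errors << "#{file_type}: unknown key #{unknown.join(', ')}" if unknown.any?
    unless COMPRESSIONS.include?(attrs["compression"])
      errors << "#{file_type}: invalid compression #{attrs['compression'].inspect}"
    end
    errors
  end
end

definitions = YAML.safe_load(ARTIFACT_TYPES_YAML)
puts validate_artifact_types(definitions).inspect
```

A check like this could run in a spec or an initializer so that a typo in a key name fails fast instead of silently dropping an attribute.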
With that, we do:
- Provide AR scopes based on these semantics. For example, Ci::JobArtifact.category_security_report queries only the rows annotated as category: security_report, which is equivalent to the Ci::JobArtifact.security_reports scope.
- Provide convenient interfaces to return the list of keys/values. e.g. Ci::JobArtifact.default_file_names should return the key/value pairs of file_type and default_file_name, which is equivalent to the DEFAULT_FILE_NAMES const.
- Provide convenient interfaces to return the annotation by the file_type. e.g. job_artifact_dotenv.default_file_name should return build.env.
- Remove *_TYPES, scope :xxx_reports, DEFAULT_FILE_NAMES, INTERNAL_TYPES, REPORT_TYPES, TEST_REPORT_FILE_TYPES from Ci::JobArtifact.
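
The lookups described above could be derived from the YAML roughly as follows. This is a sketch under stated assumptions rather than the actual implementation: it uses plain Ruby without Active Record, ArtifactTypeRegistry is a hypothetical stand-in for class-level methods on Ci::JobArtifact, and the method names types_in_category and non_erasable_types are invented for illustration.

```ruby
require "yaml"

# Excerpt of the proposed ci/artifact_type.yml, inlined so the sketch runs standalone.
DEFINITIONS = YAML.safe_load(<<~YAML)
  trace:
    compression: raw
    category: trace
    erasable: false
  junit:
    compression: gzip
    category: test_report
    default_file_name: junit.xml
  sast:
    compression: gzip
    category: security_report
    default_file_name: gl-sast-report.json
  dotenv:
    compression: gzip
    category: variables
    default_file_name: build.env
YAML

# Hypothetical module standing in for interfaces on Ci::JobArtifact.
module ArtifactTypeRegistry
  module_function

  # file_type => default_file_name, the role DEFAULT_FILE_NAMES plays today.
  def default_file_names
    DEFINITIONS.each_with_object({}) do |(type, attrs), names|
      names[type] = attrs["default_file_name"] if attrs.key?("default_file_name")
    end
  end

  # File types annotated with the given category. In the model, a scope
  # could wrap this list, e.g. where(file_type: types_in_category(...)).
  def types_in_category(category)
    DEFINITIONS.select { |_, attrs| attrs["category"] == category.to_s }.keys
  end

  # File types marked `erasable: false`; a missing key is treated as erasable.
  def non_erasable_types
    DEFINITIONS.select { |_, attrs| attrs["erasable"] == false }.keys
  end
end

puts ArtifactTypeRegistry.default_file_names.inspect
puts ArtifactTypeRegistry.types_in_category(:security_report).inspect
puts ArtifactTypeRegistry.non_erasable_types.inspect
```

Because every list is computed from the same definitions, adding a new file type would only require a new YAML entry, which is the SSOT property the proposal is after.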