Add SBOM ingestion service
Why are we doing this work
This issue provides the implementation plan for a service component that stores SBOM data in the database.
Relevant links
This is part of a wider epic to add SBOM ingestion: &8024 (closed)
Non-functional requirements
- Documentation: tbd
- Feature flag: tbd
- Performance:
  - Ensure that this service remains performant with large reports (10,000+ components)
- Testing: tbd
Implementation Plan
We will make use of BulkInsertableTask in order to create records in batches.
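As a rough illustration of what creating records in batches means for a large report, the sketch below splits rows into fixed-size slices and hands each slice to an insert callback. It does not use BulkInsertableTask itself: BATCH_SIZE, insert_in_batches, and the block that stands in for the database call are hypothetical names chosen for this example, and the real batch size may differ.

```ruby
# Hypothetical batch size; the value used by the real task may differ.
BATCH_SIZE = 1_000

# Splits +rows+ into slices of at most +batch_size+ and yields each slice.
# Each yielded slice stands in for one bulk INSERT statement.
# Returns the number of batches issued.
def insert_in_batches(rows, batch_size: BATCH_SIZE)
  batches = 0
  rows.each_slice(batch_size) do |slice|
    yield slice
    batches += 1
  end
  batches
end

# Usage: a report with 10,000 components is written in ten slices.
components = Array.new(10_000) { |i| { name: "component-#{i}" } }
inserted = []
batch_count = insert_in_batches(components) { |slice| inserted.concat(slice) }
```

Keeping each statement to a bounded number of rows is what lets the number of queries grow with report size divided by batch size, rather than with the number of components.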
Create Sbom::Ingestion::IngestReportsService
- Take Gitlab::Ci::Reports::Sbom::Reports as an argument (added via #366194 (closed))
- Create an Sbom::Ingestion::ComponentMap and Sbom::Ingestion::ComponentMapCollection
  - A component map holds a set of related models (pipeline, source, component, version, occurrence)
  - A component map collection holds a set of component maps
  - Models are initialized from a Reports object
  - When performing insertions, models will be mapped over in order to collect data for subsequent insertions (foreign keys)
- Step through the following ingestion tasks. Each task will get data from the report object, convert it into a model, and perform a bulk insert or upsert. Bulk upserts will skip over items which already have a row matching the unique_by fields. This lets us efficiently perform "create if not exists" operations.
  - IngestSbomComponents, bulk upsert, unique_by: :name
  - IngestSbomComponentVersions, bulk upsert, unique_by: :component_id, :version
  - IngestSbomSources, bulk upsert, unique_by: :fingerprint
  - IngestSbomOccurrences, bulk insert
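The ordering of the tasks above matters because each one needs the ids produced by the previous one. The following is a minimal in-memory sketch of that flow, assuming a simplified upsert that returns an id for every input row. FakeTable and its bulk_upsert method are hypothetical stand-ins written for this example, not GitLab or Rails classes; how the real tasks obtain ids for rows that already exist is not specified in this plan.

```ruby
# In-memory stand-in for a database table with a unique index.
# Hypothetical helper for illustration only.
class FakeTable
  attr_reader :rows

  def initialize(unique_by:)
    @unique_by = Array(unique_by)
    @rows = []
    @next_id = 1
  end

  # Inserts rows whose unique_by values are not present yet and skips the rest
  # ("create if not exists"). Returns the id for every input row, in input
  # order, so that later tasks can use the ids as foreign keys.
  def bulk_upsert(attributes_list)
    attributes_list.map do |attrs|
      existing = @rows.find { |row| key_for(row) == key_for(attrs) }
      next existing[:id] if existing

      row = attrs.merge(id: @next_id)
      @next_id += 1
      @rows << row
      row[:id]
    end
  end

  private

  def key_for(attrs)
    attrs.values_at(*@unique_by)
  end
end

components = FakeTable.new(unique_by: :name)
versions = FakeTable.new(unique_by: %i[component_id version])

# Parsed report data (names and versions taken from the example data in this issue).
report = [
  { name: "acorn", version: "5.7.3" },
  { name: "acorn", version: "6.4.0" },
  { name: "ajv", version: "6.10.2" }
]

# Task 1: upsert components and collect their ids.
component_ids = components.bulk_upsert(report.map { |c| { name: c[:name] } })

# Task 2: upsert versions, using the component ids as foreign keys.
version_ids = versions.bulk_upsert(
  report.each_with_index.map do |c, i|
    { component_id: component_ids[i], version: c[:version] }
  end
)
```

Here the two acorn entries resolve to the same component row but to two distinct version rows, which is the behaviour the unique_by fields in the task list are meant to give.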
Verification steps
- Enable the feature flag on your project (please comment / tick this in the feature flag rollout issue).
- Create a new project from a template, using the NodeJS/Express template. Make sure that the group under which you create this project has an Ultimate plan.
- Create a .gitlab-ci.yml file with this configuration:

      include:
        - template: Security/Dependency-Scanning.gitlab-ci.yml

      gemnasium-dependency_scanning:
        # Needed until https://gitlab.com/gitlab-org/gitlab/-/merge_requests/99126 is on production
        artifacts:
          reports:
            cyclonedx: "**/gl-sbom-*.cdx.json"

- Verify that the gemnasium-dependency_scanning job outputs a gl-sbom-npm-npm.cdx.json artifact.
- Use teleport to connect to the db console. For production, this will require an access request, but the request is sent automatically to the #infrastructure-lounge channel when you run tsh login and is usually approved very quickly.
- Take this query and paste it into an editor. Replace YOUR_PIPELINE_ID with the ID of the CI pipeline which produced the gl-sbom-npm-npm.cdx.json artifact.

      select name, version, component_type, source_id
      from sbom_components
      inner join sbom_component_versions on sbom_components.id = sbom_component_versions.component_id
      inner join sbom_occurrences on sbom_component_versions.id = sbom_occurrences.component_version_id
      where pipeline_id = YOUR_PIPELINE_ID;

- Paste the query with your pipeline ID into the database console. Verify expected output.
Example data
name | version | component_type | source_id
----------------------------------------------------+------------------------------------+----------------+-----------
github.com/astaxie/beego | v1.10.0 | 0 | 1
github.com/davecgh/go-spew | v1.1.1 | 0 | 1
github.com/konsorten/go-windows-terminal-sequences | v1.0.1 | 0 | 1
github.com/minio/minio | v0.0.0-20180419184637-5a16671f721f | 0 | 1
github.com/minio/minio-go | v6.0.14 | 0 | 1
github.com/minio/sha256-simd | v0.1.1 | 0 | 1
github.com/pmezard/go-difflib | v1.0.0 | 0 | 1
github.com/sirupsen/logrus | v1.4.2 | 0 | 1
github.com/stretchr/objx | v0.1.1 | 0 | 1
github.com/stretchr/testify | v1.2.2 | 0 | 1
golang.org/x/sys | v0.0.0-20190422165155-953cdadca894 | 0 | 1
golang.org/x/sys | v0.0.0-20191026070338-33540a1f6037 | 0 | 1
gopkg.in/check.v1 | v0.0.0-20161208181325-20d25e280405 | 0 | 1
gopkg.in/fake-package | v0.0.0-20161208181325-20d25e280405 | 0 | 1
gopkg.in/yaml.v2 | v2.2.2 | 0 | 1
abab | 2.0.3 | 0 | 2
acorn | 5.7.3 | 0 | 2
acorn | 6.4.0 | 0 | 2
acorn-globals | 4.3.4 | 0 | 2
acorn-walk | 6.2.0 | 0 | 2
ajv | 6.10.2 | 0 | 2
align-text | 0.1.4 | 0 | 2
amdefine | 1.0.1 | 0 | 2
ansi-regex | 2.1.1 | 0 | 2
ansi-regex | 3.0.0 | 0 | 2
ansi-styles | 2.2.1 | 0 | 2
append-transform | 0.4.0 | 0 | 2
archy | 1.0.0 | 0 | 2
arr-diff | 4.0.0 | 0 | 2
arr-flatten | 1.1.0 | 0 | 2
arr-union | 3.1.0 | 0 | 2
array-equal | 1.0.0 | 0 | 2
array-unique | 0.3.2 | 0 | 2
arrify | 1.0.1 | 0 | 2
asn1 | 0.2.4 | 0 | 2
assert-plus | 1.0.0 | 0 | 2
assign-symbols | 1.0.0 | 0 | 2
async | 1.5.2 | 0 | 2
async-limiter | 1.0.1 | 0 | 2
asynckit | 0.4.0 | 0 | 2
atob | 2.1.1 | 0 | 2
aws-sign2 | 0.7.0 | 0 | 2
aws4 | 1.9.0 | 0 | 2
babel-code-frame | 6.26.0 | 0 | 2
babel-generator | 6.26.1 | 0 | 2
babel-messages | 6.23.0 | 0 | 2
babel-runtime | 6.26.0 | 0 | 2
babel-template | 6.26.0 | 0 | 2
babel-traverse | 6.26.0 | 0 | 2
babel-types | 6.26.0 | 0 | 2
babylon | 6.18.0 | 0 | 2
balanced-match | 1.0.0 | 0 | 2
base | 0.11.2 | 0 | 2
bcrypt-pbkdf | 1.0.2 | 0 | 2
brace-expansion | 1.1.11 | 0 | 2
braces | 2.3.2 | 0 | 2
browser-process-hrtime | 0.1.3 | 0 | 2
builtin-modules | 1.1.1 | 0 | 2
cache-base | 1.0.1 | 0 | 2
caching-transform | 1.0.1 | 0 | 2
camelcase | 1.2.1 | 0 | 2
camelcase | 4.1.0 | 0 | 2
caseless | 0.12.0 | 0 | 2
center-align | 0.1.3 | 0 | 2
chalk | 1.1.3 | 0 | 2
class-utils | 0.3.6 | 0 | 2
cliui | 2.1.0 | 0 | 2
cliui | 4.1.0 | 0 | 2
code-point-at | 1.1.0 | 0 | 2
collection-visit | 1.0.0 | 0 | 2
combined-stream | 1.0.8 | 0 | 2
commondir | 1.0.1 | 0 | 2
component-emitter | 1.2.1 | 0 | 2
concat-map | 0.0.1 | 0 | 2
convert-source-map | 1.5.1 | 0 | 2
copy-descriptor | 0.1.1 | 0 | 2
core-js | 2.5.6 | 0 | 2
core-util-is | 1.0.2 | 0 | 2
cross-spawn | 4.0.2 | 0 | 2
cross-spawn | 5.1.0 | 0 | 2
cssom | 0.3.8 | 0 | 2
cssstyle | 1.4.0 | 0 | 2
dashdash | 1.14.1 | 0 | 2
data-urls | 1.1.0 | 0 | 2
debug | 2.6.9 | 0 | 2
debug | 3.1.0 | 0 | 2
debug-log | 1.0.1 | 0 | 2
decamelize | 1.1.1 | 0 | 2
decamelize | 1.2.0 | 0 | 2
decode-uri-component | 0.2.0 | 0 | 2
deep-is | 0.1.3 | 0 | 2
default-require-extensions | 1.0.0 | 0 | 2
define-property | 0.2.5 | 0 | 2
define-property | 1.0.0 | 0 | 2
define-property | 2.0.2 | 0 | 2
delayed-stream | 1.0.0 | 0 | 2
detect-indent | 4.0.0 | 0 | 2
domexception | 1.0.1 | 0 | 2
ecc-jsbn | 0.1.2 | 0 | 2
error-ex | 1.3.1 | 0 | 2
escape-string-regexp | 1.0.5 | 0 | 2
escodegen | 1.12.0 | 0 | 2
esprima | 3.1.3 | 0 | 2
estraverse | 4.3.0 | 0 | 2
esutils | 2.0.2 | 0 | 2
esutils | 2.0.3 | 0 | 2
execa | 0.7.0 | 0 | 2
expand-brackets | 2.1.4 | 0 | 2
extend | 3.0.2 | 0 | 2
extend-shallow | 2.0.1 | 0 | 2
extend-shallow | 3.0.2 | 0 | 2
extglob | 2.0.4 | 0 | 2
extsprintf | 1.3.0 | 0 | 2
fast-deep-equal | 2.0.1 | 0 | 2
fast-json-stable-stringify | 2.0.0 | 0 | 2
fast-levenshtein | 2.0.6 | 0 | 2
fill-range | 4.0.0 | 0 | 2
find-cache-dir | 0.1.1 | 0 | 2
find-up | 1.1.2 | 0 | 2
find-up | 2.1.0 | 0 | 2
for-in | 1.0.2 | 0 | 2
foreground-child | 1.5.6 | 0 | 2
forever-agent | 0.6.1 | 0 | 2
form-data | 2.3.3 | 0 | 2
fragment-cache | 0.2.1 | 0 | 2
fs.realpath | 1.0.0 | 0 | 2
get-caller-file | 1.0.2 | 0 | 2
get-stream | 3.0.0 | 0 | 2
get-value | 2.0.6 | 0 | 2
getpass | 0.1.7 | 0 | 2
glob | 7.1.2 | 0 | 2
globals | 9.18.0 | 0 | 2
graceful-fs | 4.1.11 | 0 | 2
handlebars | 4.0.11 | 0 | 2
har-schema | 2.0.0 | 0 | 2
har-validator | 5.1.3 | 0 | 2
has-ansi | 2.0.0 | 0 | 2
has-flag | 1.0.0 | 0 | 2
has-value | 0.3.1 | 0 | 2
has-value | 1.0.0 | 0 | 2
has-values | 0.1.4 | 0 | 2
has-values | 1.0.0 | 0 | 2
highlight.js | 9.16.2 | 0 | 2
hosted-git-info | 2.6.0 | 0 | 2
html-encoding-sniffer | 1.0.2 | 0 | 2
http-signature | 1.2.0 | 0 | 2
iconv-lite | 0.4.24 | 0 | 2
imurmurhash | 0.1.4 | 0 | 2
inflight | 1.0.6 | 0 | 2
inherits | 2.0.3 | 0 | 2
invariant | 2.2.4 | 0 | 2
invert-kv | 1.0.0 | 0 | 2
is-accessor-descriptor | 0.1.6 | 0 | 2
is-accessor-descriptor | 1.0.0 | 0 | 2
is-arrayish | 0.2.1 | 0 | 2
is-buffer | 1.1.6 | 0 | 2
is-builtin-module | 1.0.0 | 0 | 2
is-data-descriptor | 0.1.4 | 0 | 2
is-data-descriptor | 1.0.0 | 0 | 2
is-descriptor | 0.1.6 | 0 | 2
is-descriptor | 1.0.2 | 0 | 2
is-extendable | 0.1.1 | 0 | 2
is-extendable | 1.0.1 | 0 | 2
is-finite | 1.0.2 | 0 | 2
is-fullwidth-code-point | 1.0.0 | 0 | 2
is-fullwidth-code-point | 2.0.0 | 0 | 2
is-number | 3.0.0 | 0 | 2
is-number | 4.0.0 | 0 | 2
is-odd | 2.0.0 | 0 | 2
is-plain-object | 2.0.4 | 0 | 2
is-stream | 1.1.0 | 0 | 2
is-typedarray | 1.0.0 | 0 | 2
is-utf8 | 0.2.1 | 0 | 2
is-windows | 1.0.2 | 0 | 2
isarray | 1.0.0 | 0 | 2
isexe | 2.0.0 | 0 | 2
isobject | 2.1.0 | 0 | 2
isobject | 3.0.1 | 0 | 2
isstream | 0.1.2 | 0 | 2
istanbul-lib-coverage | 1.2.0 | 0 | 2
istanbul-lib-hook | 1.1.0 | 0 | 2
istanbul-lib-instrument | 1.10.1 | 0 | 2
istanbul-lib-report | 1.1.3 | 0 | 2
istanbul-lib-source-maps | 1.2.3 | 0 | 2
istanbul-reports | 1.4.0 | 0 | 2
js-tokens | 3.0.2 | 0 | 2
jsbn | 0.1.1 | 0 | 2
jsdom | 11.12.0 | 0 | 2
jsesc | 1.3.0 | 0 | 2
json-schema | 0.2.3 | 0 | 2
json-schema-traverse | 0.4.1 | 0 | 2
json-stringify-safe | 5.0.1 | 0 | 2
jsprim | 1.4.1 | 0 | 2
kind-of | 3.2.2 | 0 | 2
kind-of | 4.0.0 | 0 | 2
kind-of | 5.1.0 | 0 | 2
kind-of | 6.0.2 | 0 | 2
lazy-cache | 1.0.4 | 0 | 2
lcid | 1.0.0 | 0 | 2
left-pad | 1.3.0 | 0 | 2
levn | 0.3.0 | 0 | 2
load-json-file | 1.1.0 | 0 | 2
locate-path | 2.0.0 | 0 | 2
lodash | 4.17.10 | 0 | 2
lodash | 4.17.15 | 0 | 2
lodash.sortby | 4.7.0 | 0 | 2
longest | 1.0.1 | 0 | 2
loose-envify | 1.3.1 | 0 | 2
lru-cache | 4.1.3 | 0 | 2
map-cache | 0.2.2 | 0 | 2
map-visit | 1.0.0 | 0 | 2
md5-hex | 1.3.0 | 0 | 2
md5-o-matic | 0.1.1 | 0 | 2
mem | 1.1.0 | 0 | 2
merge-source-map | 1.1.0 | 0 | 2
micromatch | 3.1.10 | 0 | 2
mime-db | 1.42.0 | 0 | 2
mime-types | 2.1.25 | 0 | 2
mimic-fn | 1.2.0 | 0 | 2
minimatch | 3.0.4 | 0 | 2
minimist | 0.0.8 | 0 | 2
mixin-deep | 1.3.1 | 0 | 2
mkdirp | 0.5.1 | 0 | 2
moment | 2.24.0 | 0 | 2
ms | 2.0.0 | 0 | 2
nanomatch | 1.2.9 | 0 | 2
normalize-package-data | 2.4.0 | 0 | 2
normalize.css | 7.0.0 | 0 | 2
npm-run-path | 2.0.2 | 0 | 2
number-is-nan | 1.0.1 | 0 | 2
nwsapi | 2.2.0 | 0 | 2
nyc | 11.9.0 | 0 | 2
oauth-sign | 0.9.0 | 0 | 2
object-assign | 4.1.1 | 0 | 2
object-copy | 0.1.0 | 0 | 2
object-visit | 1.0.1 | 0 | 2
object.pick | 1.3.0 | 0 | 2
once | 1.4.0 | 0 | 2
optimist | 0.6.1 | 0 | 2
optionator | 0.8.3 | 0 | 2
os-homedir | 1.0.2 | 0 | 2
os-locale | 2.1.0 | 0 | 2
p-finally | 1.0.0 | 0 | 2
p-limit | 1.2.0 | 0 | 2
p-locate | 2.0.0 | 0 | 2
p-try | 1.0.0 | 0 | 2
parse-json | 2.2.0 | 0 | 2
parse5 | 4.0.0 | 0 | 2
pascalcase | 0.1.1 | 0 | 2
path-exists | 2.1.0 | 0 | 2
path-exists | 3.0.0 | 0 | 2
path-is-absolute | 1.0.1 | 0 | 2
path-key | 2.0.1 | 0 | 2
path-parse | 1.0.5 | 0 | 2
path-type | 1.1.0 | 0 | 2
performance-now | 2.1.0 | 0 | 2
pify | 2.3.0 | 0 | 2
pinkie | 2.0.4 | 0 | 2
pinkie-promise | 2.0.1 | 0 | 2
pkg-dir | 1.0.0 | 0 | 2
pn | 1.1.0 | 0 | 2
posix-character-classes | 0.1.1 | 0 | 2
prelude-ls | 1.1.2 | 0 | 2
pseudomap | 1.0.2 | 0 | 2
psl | 1.5.0 | 0 | 2
punycode | 1.4.1 | 0 | 2
punycode | 2.1.1 | 0 | 2
qs | 6.5.2 | 0 | 2
read-pkg | 1.1.0 | 0 | 2
read-pkg-up | 1.0.1 | 0 | 2
regenerator-runtime | 0.11.1 | 0 | 2
regex-not | 1.0.2 | 0 | 2
repeat-element | 1.1.2 | 0 | 2
repeat-string | 1.6.1 | 0 | 2
repeating | 2.0.1 | 0 | 2
request | 2.88.0 | 0 | 2
request-promise-core | 1.1.3 | 0 | 2
request-promise-native | 1.0.8 | 0 | 2
require-directory | 2.1.1 | 0 | 2
require-main-filename | 1.0.1 | 0 | 2
resolve-from | 2.0.0 | 0 | 2
resolve-url | 0.2.1 | 0 | 2
ret | 0.1.15 | 0 | 2
right-align | 0.1.3 | 0 | 2
rimraf | 2.6.2 | 0 | 2
safe-buffer | 5.1.2 | 0 | 2
safe-regex | 1.1.0 | 0 | 2
safer-buffer | 2.1.2 | 0 | 2
sax | 1.2.4 | 0 | 2
semver | 5.5.0 | 0 | 2
set-blocking | 2.0.0 | 0 | 2
set-value | 0.4.3 | 0 | 2
set-value | 2.0.0 | 0 | 2
shebang-command | 1.2.0 | 0 | 2
shebang-regex | 1.0.0 | 0 | 2
signal-exit | 3.0.2 | 0 | 2
slide | 1.1.6 | 0 | 2
snapdragon | 0.8.2 | 0 | 2
snapdragon-node | 2.1.1 | 0 | 2
snapdragon-util | 3.0.1 | 0 | 2
source-map | 0.4.4 | 0 | 2
source-map | 0.5.7 | 0 | 2
source-map | 0.6.1 | 0 | 2
source-map-resolve | 0.5.1 | 0 | 2
source-map-url | 0.4.0 | 0 | 2
spawn-wrap | 1.4.2 | 0 | 2
spdx-correct | 3.0.0 | 0 | 2
spdx-exceptions | 2.1.0 | 0 | 2
spdx-expression-parse | 3.0.0 | 0 | 2
spdx-license-ids | 3.0.0 | 0 | 2
split-string | 3.1.0 | 0 | 2
sshpk | 1.16.1 | 0 | 2
static-extend | 0.1.2 | 0 | 2
stealthy-require | 1.1.1 | 0 | 2
string-width | 1.0.2 | 0 | 2
string-width | 2.1.1 | 0 | 2
strip-ansi | 3.0.1 | 0 | 2
strip-ansi | 4.0.0 | 0 | 2
strip-bom | 2.0.0 | 0 | 2
strip-eof | 1.0.0 | 0 | 2
supports-color | 2.0.0 | 0 | 2
supports-color | 3.2.3 | 0 | 2
symbol-tree | 3.2.4 | 0 | 2
test-exclude | 4.2.1 | 0 | 2
to-fast-properties | 1.0.3 | 0 | 2
to-object-path | 0.3.0 | 0 | 2
to-regex | 3.0.2 | 0 | 2
to-regex-range | 2.1.1 | 0 | 2
tough-cookie | 2.4.3 | 0 | 2
tr46 | 1.0.1 | 0 | 2
trim-right | 1.0.1 | 0 | 2
tunnel-agent | 0.6.0 | 0 | 2
tweetnacl | 0.14.5 | 0 | 2
type-check | 0.3.2 | 0 | 2
uglify-js | 2.8.29 | 0 | 2
uglify-to-browserify | 1.0.2 | 0 | 2
union-value | 1.0.0 | 0 | 2
unset-value | 1.0.0 | 0 | 2
uri-js | 4.2.2 | 0 | 2
urix | 0.1.0 | 0 | 2
use | 3.1.0 | 0 | 2
uuid | 3.3.3 | 0 | 2
validate-npm-package-license | 3.0.3 | 0 | 2
verror | 1.10.0 | 0 | 2
w3c-hr-time | 1.0.1 | 0 | 2
webidl-conversions | 4.0.2 | 0 | 2
whatwg-encoding | 1.0.5 | 0 | 2
whatwg-mimetype | 2.3.0 | 0 | 2
whatwg-url | 6.5.0 | 0 | 2
whatwg-url | 7.1.0 | 0 | 2
which | 1.3.0 | 0 | 2
which-module | 2.0.0 | 0 | 2
window-size | 0.1.0 | 0 | 2
word-wrap | 1.2.3 | 0 | 2
wordwrap | 0.0.2 | 0 | 2
wordwrap | 0.0.3 | 0 | 2
wrap-ansi | 2.1.0 | 0 | 2
wrappy | 1.0.2 | 0 | 2
write-file-atomic | 1.3.4 | 0 | 2
ws | 5.2.2 | 0 | 2
xml-name-validator | 3.0.0 | 0 | 2
y18n | 3.2.1 | 0 | 2
yallist | 2.1.2 | 0 | 2
yargs | 11.1.0 | 0 | 2
yargs | 3.10.0 | 0 | 2
yargs-parser | 8.1.0 | 0 | 2
yargs-parser | 9.0.2 | 0 | 2
(367 rows)