Usage Ping timeouts on security_products_usage for large hosts

Parent issue: https://gitlab.com/gitlab-org/telemetry/issues/308

The following discussion from !26194 (merged) should be addressed:

  • @a_akgun started a discussion: (+5 comments)

    @dstull deleted previous comment about reorganization of keys - let's discuss in telemetry sync

    • I think every section should be self-contained

    🤔

    • ci_builds is one of the most difficult tables
    • This returns {}, which could be why we don't see dast in large instances.
    • The group by definitely times out in large instances: ci_builds has ~400M rows and is one of the largest tables.

    @dstull

    1. We could avoid the group and write each key on its own line, without the Hash.new(-1) trick, which doesn't work in case of a timeout: we end up with an empty hash {} instead of counters with -1 values.
    2. We could remove the batch: false (which I added).
    3. We could test in database-labs how it performs: results = count(::Ci::Build.where(name: :dast))
    4. We could run each one on its own line, to be more readable.
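
To illustrate points 1 and 4, here is a minimal plain-Ruby sketch of why the Hash.new(-1) default leaves an empty hash after a timeout, and how counting each key separately keeps the -1 values. It does not use the real usage ping code: QueryTimeout, safe_count, grouped_counts and the stubbed count of 42 are hypothetical stand-ins, and the blocks take the place of actual database queries.

```ruby
# Hypothetical stand-in for a database statement timeout.
class QueryTimeout < StandardError; end

FALLBACK = -1

# Hypothetical helper: run one counting query and return FALLBACK
# if that single query times out.
def safe_count(fallback: FALLBACK)
  yield
rescue QueryTimeout
  fallback
end

# Grouped approach: one query is expected to return all counters at once.
# Hash.new(FALLBACK) only sets a default for lookups; it stores no keys.
def grouped_counts
  results = Hash.new(FALLBACK)
  begin
    results.merge!(yield)
  rescue QueryTimeout
    # Nothing was merged, so the hash stays empty.
  end
  results
end

# The grouped query times out: the hash has no keys, so it serializes as {}
# even though grouped[:dast] would still answer -1.
grouped = grouped_counts { raise QueryTimeout }

# Per-key approach: each counter is computed and rescued on its own line,
# so a timeout on one key is recorded as -1 and the others are unaffected.
per_key = {
  dast: safe_count { raise QueryTimeout }, # simulated timeout
  sast: safe_count { 42 }                  # stubbed value, not real data
}
```

With one counter per line, a timed-out key still appears in the payload with -1, which makes a timeout distinguishable from a missing counter.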

    Going forward I think we should reorganize all of the usage counters.

Edited by Doug Stull