Disable scheduling of productivity analytics

Yorick Peterse requested to merge revert-productivity-analytics-migrations into master

What does this MR do?

This migration times out on both GitLab's staging and production environments, even when an index is added for the columns used. As we are nearing release day, we have decided to turn this migration into a noop for the time being.

The background migration is not removed as some jobs may have been scheduled (especially in dev environments). Keeping this code allows those jobs to finish, and allows us to reschedule it in the future if needed.
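As a rough illustration of what turning the scheduling migration into a noop can look like, here is a minimal sketch. The class name is inferred from the migration's file name; `SketchMigration` is a hypothetical stand-in base class so the snippet runs without Rails, and the actual diff in this MR may differ.

```ruby
# Hypothetical stand-in so this sketch runs without Rails. In GitLab the
# migration would inherit from the real ActiveRecord migration class.
class SketchMigration
end

class ScheduleProductivityAnalyticsBackfill < SketchMigration
  def up
    # no-op: scheduling is disabled because the batching query times out
  end

  def down
    # no-op: nothing is scheduled by `up`, so there is nothing to undo
  end
end
```

Because only the scheduling step is emptied, the background migration class itself stays in the codebase and already-enqueued jobs can still run.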

The migration (even with the index present) timed out in deployment https://ops.gitlab.net/gitlab-com/gl-infra/deployer/-/jobs/640029:

    Caused by:
    PG::QueryCanceled: ERROR:  canceling statement due to statement timeout
    /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:78:in `block in each_batch'
    /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:68:in `step'
    /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:68:in `each_batch'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers.rb:1020:in `queue_background_migration_jobs_by_range_at_intervals'
    /opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20190918104222_schedule_productivity_analytics_backfill.rb:25:in `up'
    /opt/gitlab/embedded/bin/bundle:23:in `load'
    /opt/gitlab/embedded/bin/bundle:23:in `<main>'
    Tasks: TOP => db:migrate
    (See full trace by running task with --trace)
  stderr_lines:
  - rake aborted!
  - 'StandardError: An error has occurred, all later migrations canceled:'
  - ''
  - 'PG::QueryCanceled: ERROR:  canceling statement due to statement timeout'
  - ': SELECT  "merge_request_metrics"."id" FROM "merge_request_metrics" WHERE (merged_at >= ''2019-06-19 12:29:36.376177'') AND "merge_request_metrics"."id" >= 153435 ORDER BY "merge_request_metrics"."id" ASC LIMIT 1 OFFSET 10000'
  - /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:78:in `block in each_batch'
  - /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:68:in `step'
  - /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:68:in `each_batch'
  - /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers.rb:1020:in `queue_background_migration_jobs_by_range_at_intervals'
  - /opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20190918104222_schedule_productivity_analytics_backfill.rb:25:in `up'
  - /opt/gitlab/embedded/bin/bundle:23:in `load'
  - /opt/gitlab/embedded/bin/bundle:23:in `<main>'
  - ''
  - 'Caused by:'
  - 'ActiveRecord::QueryCanceled: PG::QueryCanceled: ERROR:  canceling statement due to statement timeout'
  - ': SELECT  "merge_request_metrics"."id" FROM "merge_request_metrics" WHERE (merged_at >= ''2019-06-19 12:29:36.376177'') AND "merge_request_metrics"."id" >= 153435 ORDER BY "merge_request_metrics"."id" ASC LIMIT 1 OFFSET 10000'
  - /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:78:in `block in each_batch'
  - /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:68:in `step'
  - /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:68:in `each_batch'
  - /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers.rb:1020:in `queue_background_migration_jobs_by_range_at_intervals'
  - /opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20190918104222_schedule_productivity_analytics_backfill.rb:25:in `up'
  - /opt/gitlab/embedded/bin/bundle:23:in `load'
  - /opt/gitlab/embedded/bin/bundle:23:in `<main>'
  - ''
  - 'Caused by:'
  - 'PG::QueryCanceled: ERROR:  canceling statement due to statement timeout'
  - /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:78:in `block in each_batch'
  - /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:68:in `step'
  - /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/each_batch.rb:68:in `each_batch'
  - /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers.rb:1020:in `queue_background_migration_jobs_by_range_at_intervals'
  - /opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20190918104222_schedule_productivity_analytics_backfill.rb:25:in `up'
  - /opt/gitlab/embedded/bin/bundle:23:in `load'
  - /opt/gitlab/embedded/bin/bundle:23:in `<main>'
  - 'Tasks: TOP => db:migrate'
  - (See full trace by running task with --trace)
  stdout: |-

Per @pshutsin, the feature that wanted this data can work fine without it. Because of this we are only disabling the scheduling of the background migration, instead of also reverting the feature.

This means that for 12.4 or a future release, a better approach for scheduling should be implemented. Even once the index starts getting used by PostgreSQL (we're not sure why it did not immediately use it), the queries take around 4 seconds to run. Depending on how large the table is, this means the scheduling could take a very long time.
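One possible direction, shown only as a sketch and not as the approach decided on here, is to stop locating batch boundaries with `LIMIT 1 OFFSET 10000` on a filtered query and instead cut the id space into fixed-width windows. Each job would then apply the `merged_at` filter itself within its window. The method names, the 120-second interval, and the idea that the minimum and maximum ids are fetched up front are all assumptions for illustration.

```ruby
# Hypothetical sketch: derive job ranges from id bounds alone, so the
# scheduling step never has to skip over 10,000 matching rows per batch.
BATCH_SIZE = 10_000      # matches the OFFSET seen in the failing query
INTERVAL_SECONDS = 120   # assumed delay between jobs, not taken from the MR

# Split min_id..max_id into [start_id, end_id] windows of batch_size ids.
def fixed_id_ranges(min_id, max_id, batch_size)
  return [] if min_id.nil? || max_id.nil? || min_id > max_id

  (min_id..max_id).step(batch_size).map do |start_id|
    [start_id, [start_id + batch_size - 1, max_id].min]
  end
end

# Pair every window with an increasing delay, mimicking "at intervals" scheduling.
def schedule(min_id, max_id, batch_size: BATCH_SIZE, interval: INTERVAL_SECONDS)
  fixed_id_ranges(min_id, max_id, batch_size).each_with_index.map do |(start_id, end_id), index|
    { delay: (index + 1) * interval, start_id: start_id, end_id: end_id }
  end
end

schedule(153_435, 175_000).each do |job|
  puts "in #{job[:delay]}s: ids #{job[:start_id]}..#{job[:end_id]}"
end
```

The trade-off is that windows are defined by id rather than by matching rows, so some jobs may find few or no rows with a recent enough `merged_at`. In exchange, the scheduling cost depends only on the width of the id range, not on how expensive it is to count matching rows.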

Edited by Yorick Peterse
