
[BB-4059] Stop all workers on appservers that failed provisioning

Created by: 0x29a

If the Ansible process exits with a non-zero code, we want to stop the Celery workers on that appserver, because they steal tasks from other, healthy appservers.
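The intended control flow can be sketched roughly as follows. This is a minimal illustration, not the code in this branch: `provision` and `stop_workers` are hypothetical names, and how Ocim actually stops the workers on the appserver is not shown here.

```python
import subprocess
from typing import Callable, Sequence


def provision(playbook_cmd: Sequence[str], stop_workers: Callable[[], None]) -> bool:
    """Run a provisioning command and stop workers if it fails.

    `playbook_cmd` is the command line to run (for example an Ansible
    invocation); `stop_workers` is a hypothetical callback that stops the
    Celery workers on the appserver being provisioned.
    Returns True if the command exited with code 0, False otherwise.
    """
    # check=False: inspect the exit code ourselves instead of raising.
    result = subprocess.run(playbook_cmd, check=False)
    if result.returncode != 0:
        # Provisioning failed: make sure this appserver's workers
        # cannot pick up tasks meant for healthy appservers.
        stop_workers()
        return False
    return True
```

Passing the stop action as a callback keeps the sketch independent of any particular way of reaching the appserver.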

JIRA tickets:

Sandbox URL:

Testing instructions:

What I did:

  1. Created a branch in the open-craft/configuration repo with a change that makes the main playbook fail early.
  2. Created an instance and configured it to use this branch.
  3. Switched staging Ocim to the 0x29a/bb4059/stop_services_if_provision_failed branch and created an appserver.

What the reviewer has to test:

  1. Go to https://stage.manage.opencraft.com/instance/7920/edx-appserver/3309/ and click on "Logs".
  2. Check the logs and verify that, after configuration failed, Ocim tried to stop the Celery workers.

