feat: implement periodic email notifications

Boros Gábor requested to merge gabor/periodic-faas into main

Description

This MR adds an OpenFaaS function to handle pipeline status notification webhooks.

The function fetches the pipeline details, gathers any child pipelines, and retrieves the job that failed earliest. Using the job ID, it retrieves the log file and attaches it to a notification email. The email body is populated with the last 25 lines of the job log, which are very likely to contain the root cause.
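The selection and truncation steps described above can be sketched roughly as follows. This is only an illustration, not the function's actual code: the helper names `earliest_failed_job` and `log_tail` are hypothetical, the job dictionaries are assumed to carry `status` and `finished_at` fields as in the GitLab jobs API, and "failed earliest" is assumed to mean the smallest `finished_at` timestamp.

```python
from datetime import datetime
from typing import Dict, List, Optional

TAIL_LINES = 25  # the email body uses the last 25 lines of the job log


def _parse_timestamp(value: str) -> datetime:
    # GitLab timestamps end in "Z"; normalize so fromisoformat accepts them.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def earliest_failed_job(jobs: List[Dict]) -> Optional[Dict]:
    """Return the failed job that finished first, or None if no job failed.

    `jobs` is assumed to be the combined job list of the parent pipeline
    and its child pipelines.
    """
    failed = [
        job for job in jobs
        if job.get("status") == "failed" and job.get("finished_at")
    ]
    if not failed:
        return None
    return min(failed, key=lambda job: _parse_timestamp(job["finished_at"]))


def log_tail(log_text: str, lines: int = TAIL_LINES) -> str:
    """Return the last `lines` lines of a job log."""
    return "\n".join(log_text.splitlines()[-lines:])
```

Sorting by `finished_at` rather than by job ID is a design assumption here; job IDs reflect creation order, which need not match the order in which jobs fail.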

If a kubectl command fails, the logs may lack the most meaningful information, so a follow-up issue and MR are planned. This is not handled here because it is out of scope for this MR and the related issue.

Supporting information

Testing instructions

Steps to test the changes:

Option 1

  1. Proofread and follow the instructions in the related documentation (the function's README)
  2. Ensure the repository has a webhook calling the function (with basic auth in the URL)
  3. Click the play button here
  4. Wait for the pipeline to fail. Canceling the pipeline does not trigger a notification!
  5. Check the configured recipient email
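Step 4 notes that a canceled pipeline does not lead to an email. One way such a filter could look is sketched below. This assumes the check happens inside the function and relies on the `object_kind` and `object_attributes.status` fields of GitLab's pipeline webhook payload; the name `should_notify` is hypothetical and the real function may decide differently.

```python
from typing import Dict


def should_notify(payload: Dict) -> bool:
    """Return True only for pipeline events that report a failure.

    Other statuses such as "canceled" or "success" are ignored.
    """
    if payload.get("object_kind") != "pipeline":
        return False
    attributes = payload.get("object_attributes") or {}
    return attributes.get("status") == "failed"
```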

Option 2

  1. Proofread the function's instructions
  2. Go to https://gitlab.com/opencraft/ops/grove-stage-digitalocean/-/pipeline_schedules and validate the pipeline schedule is set
  3. Go to https://openfaas.dev.grove-do.opencraft.hosting/ui/ and validate the function is available
  4. Go to https://gitlab.com/opencraft/ops/grove-stage-digitalocean/-/hooks and validate the webhook for https://openfaas.dev.grove-do.opencraft.hosting is set
  5. Go to https://gitlab.com/opencraft/ops/grove-stage-digitalocean/-/hooks/12817414/edit and validate the webhook calls are returning 200.
  6. Go to https://gitlab.com/opencraft/ops/grove-stage-digitalocean/-/jobs/2485943562 and validate it failed

Dependencies

N/A

Screenshots

Screenshot_2022-05-20_at_16.01.05

Checklist

If any of the items below is not applicable, do not remove it; put a check in it instead.

  • All providers include the new feature/change
  • All affected providers can provision new clusters
  • Unit tests are added/updated (not added: we should investigate the best practices for testing functions like this)
  • Documentation is added/updated
  • The TOOLS_CONTAINER_IMAGE_VERSION in ci_vars.yml is updated
  • The grove-template repository is updated

Additional context

We had to add CI_COMMIT_MESSAGE as a direct variable for generated pipelines, because only direct variables are returned in the content of the webhook. To avoid overriding any environment variables, the variable is named COMMIT_MESSAGE.
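As a rough illustration of how the function might read that variable back, the sketch below looks it up in the `object_attributes.variables` list of a pipeline webhook payload, where each entry is a `key`/`value` pair. The helper name `webhook_variable` is hypothetical and not taken from this MR.

```python
from typing import Dict, Optional


def webhook_variable(payload: Dict, key: str) -> Optional[str]:
    """Return the value of a direct pipeline variable from the webhook payload.

    Returns None when the variable is absent, e.g. because it was not
    passed to the pipeline as a direct variable.
    """
    attributes = payload.get("object_attributes") or {}
    for variable in attributes.get("variables") or []:
        if variable.get("key") == key:
            return variable.get("value")
    return None


# Example with a hand-written payload fragment:
example_payload = {
    "object_kind": "pipeline",
    "object_attributes": {
        "status": "failed",
        "variables": [{"key": "COMMIT_MESSAGE", "value": "feat: example"}],
    },
}
commit_message = webhook_variable(example_payload, "COMMIT_MESSAGE")
```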
