feat: implement periodic email notifications
Description
This MR adds an OpenFAAS function to handle pipeline status notification webhooks.
The function fetches the pipeline details, gathers any child pipelines, and retrieves the job that failed earliest. Using the job ID, it retrieves the job log and attaches it to a notification email. The email body is populated with the last 25 lines of the job log, which most likely contain the root cause.
If a kubectl command fails, the logs may lack the most meaningful information, so a follow-up issue and MR are planned. This is not handled here because it is out of scope for this MR and the related issue.
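The selection and truncation steps described above can be sketched roughly as follows. This is a minimal illustration, not the function's actual code: it assumes the jobs are dicts shaped like GitLab Jobs API objects (with `id`, `status`, and `finished_at` fields), and it interprets "failed earliest" as the smallest `finished_at` timestamp. The helper names `earliest_failed_job` and `email_body_from_log` are hypothetical.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional

BODY_LINE_COUNT = 25  # number of trailing log lines used for the email body


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as "2022-05-20T10:15:30.000Z"."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def earliest_failed_job(jobs: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the failed job with the smallest finished_at, or None.

    `jobs` is assumed to already contain the jobs of the parent pipeline
    and of any child pipelines.
    """
    failed = [
        job for job in jobs
        if job.get("status") == "failed" and job.get("finished_at")
    ]
    if not failed:
        return None
    return min(failed, key=lambda job: parse_timestamp(job["finished_at"]))


def email_body_from_log(log_text: str, line_count: int = BODY_LINE_COUNT) -> str:
    """Return the last `line_count` lines of the job log."""
    return "\n".join(log_text.splitlines()[-line_count:])
```

The full log would still be sent as an attachment; only the email body is limited to the trailing lines.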
Supporting information
Testing instructions
Steps to test the changes:
Option 1
- Proofread and follow the instructions in the related documentation (the function's README)
- Ensure the repository has a webhook calling the function (with basic auth in the URL)
- Click the play button here
- Wait for the pipeline to fail (canceling the pipeline does not trigger a notification)
- Check the configured recipient email
Option 2
- Proofread the function's instructions
- Go to https://gitlab.com/opencraft/ops/grove-stage-digitalocean/-/pipeline_schedules and validate the pipeline schedule is set
- Go to https://openfaas.dev.grove-do.opencraft.hosting/ui/ and validate the function is available
- Go to https://gitlab.com/opencraft/ops/grove-stage-digitalocean/-/hooks and validate the webhook for https://openfaas.dev.grove-do.opencraft.hosting is set
- Go to https://gitlab.com/opencraft/ops/grove-stage-digitalocean/-/hooks/12817414/edit and validate the webhook calls are returning 200
- Go to https://gitlab.com/opencraft/ops/grove-stage-digitalocean/-/jobs/2485943562 and validate it failed
Dependencies
N/A
Screenshots
Checklist
If any of the items below are not applicable, do not remove them; check them instead.
- All providers include the new feature/change
- All affected providers can provision new clusters
- Unit tests are added/updated. Not added: we should investigate the best practices for testing functions like this
- Documentation is added/updated
- The TOOLS_CONTAINER_IMAGE_VERSION in ci_vars.yml is updated
- The grove-template repository is updated
Additional context
We had to add CI_COMMIT_MESSAGE as a direct variable for generated pipelines, because only direct variables are included in the webhook payload. To avoid overriding any environment variables, the variable is named COMMIT_MESSAGE.
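As an illustration of how the function could read this variable, here is a minimal sketch. It assumes the webhook body follows the GitLab pipeline event shape, where direct variables appear as a list of `key`/`value` objects under `object_attributes.variables`. The helper name `pipeline_variable` is hypothetical and not taken from this MR.

```python
from typing import Any, Dict, Optional


def pipeline_variable(payload: Dict[str, Any], key: str) -> Optional[str]:
    """Look up a direct pipeline variable in a pipeline webhook payload.

    Returns None when the variable (or the variables list) is missing.
    """
    attributes = payload.get("object_attributes") or {}
    for variable in attributes.get("variables") or []:
        if variable.get("key") == key:
            return variable.get("value")
    return None


# Example payload fragment (values are made up for illustration).
example_payload = {
    "object_kind": "pipeline",
    "object_attributes": {
        "status": "failed",
        "variables": [{"key": "COMMIT_MESSAGE", "value": "fix: update config"}],
    },
}
commit_message = pipeline_variable(example_payload, "COMMIT_MESSAGE")
```

Returning `None` for a missing variable lets the caller fall back to a generic email subject instead of failing the notification.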