Technical Breakdown: Slack App

This is the technical breakdown for Introduce incident workflow to GitLab Slack App (gitlab-org&8545)

👀 Jump to engineering quickstart guide to start with a light overview (optional)

Architecture

Data flow / API interactions - incident creation modal in Slack

  • Final flow: mermaid-diagram-20220907210513
  • Add /gitlab incident declare slack slash command: mermaid-diagram-20220907211054
  • Populate incident description field with template in slack modal: mermaid-diagram-20220907211151
  • Add assignees/labels dropdowns to incident slack modal: mermaid-diagram-20220907211225
Source for sequence diagrams

Key:

```mermaid
sequenceDiagram
  actor k as #160;
  
  rect rgb(245, 201, 147)
  note right of k: Add `/gitlab incident declare` slack slash command
  end
  rect rgb(241, 235, 245)
  note right of k: Populate incident description field with template in slack modal
  end
  rect rgb(235, 245, 238)
  note right of k: Add endpoint to return dropdown menu options
  end
  rect rgb(147, 176, 245)
  note right of k: Add option to incident slack modal to create slack channel for the incident
  end
```

Interactions:

```mermaid
sequenceDiagram
  participant S as Slack
  participant G as GitLab

  rect rgb(245, 201, 147)
  note right of S: /gitlab incident declare
  S->>+G: POST /slack/trigger (COMMAND_ENDPOINT)
  note over S,G: payload: slash command details
  G-->>-S: 200 OK
  end

  rect rgb(245, 201, 147)
  note left of G: display modal
  G->>+S: POST /api/views.open 
  note over S,G: payload: modal layout details
  S-->>-G: 200 OK
  end

  rect rgb(241, 235, 245)
  note right of S: select project
  S->>+G: POST 
  note over S,G: payload: block action details
  G-->>-S: 200 OK
  end

  rect rgb(241, 235, 245)
  note left of G: set description to incident template
  G->>+S: POST /api/views.update 
  note over S,G: payload: modal layout details
  S-->>-G: 200 OK
  end

  rect rgb(235, 245, 238)
  note right of S: open assignee/label dropdown
  S->>+G: POST 
  note over S,G: payload: identifiers for select menu
  G-->>-S: 200 OK
  note over S,G: payload: menu options
  note right of S: populate menu options
  end

  rect rgb(235, 245, 238)
  note right of S: search in assignee/label dropdown
  S->>+G: POST 
  note over S,G: payload: identifiers for select menu & query
  G-->>-S: 200 OK
  note over S,G: payload: menu options
  note right of S: replace menu options
  end

  rect rgb(245, 201, 147)
  note right of S: submit modal
  S->>+G: POST /slack/trigger (COMMAND_ENDPOINT)
  note over S,G: payload: modal submission data
  G-->>-S: 200 OK
  end

  rect rgb(147, 176, 245)
  note left of G: create channel
  G->>+S: POST /api/conversations.create
  note over S,G: payload: channel details
  S-->>-G: 200 OK
  end

  rect rgb(245, 201, 147)
  note left of G: display confirmation/error message
  G->>+S: POST response_url
  note over S,G: payload: message details
  S-->>-G: 200 OK
  end
```

Storing identifiers for continuity

We may need to cache `response_url`s and `view_id`s so that later requests can be tied back to the original slash command. 🤔 This is definitely needed for populating the project dropdown.
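
As a rough illustration of what that caching could look like, the sketch below keeps an in-memory map from a modal's `view_id` to the `response_url` received with the slash command. The class name, the 30-minute default TTL, and the in-memory storage are all assumptions made for the example; a real implementation would more likely sit on a shared cache.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ViewContextCache:
    """Hypothetical in-memory map from a modal's view_id to a response_url."""

    def __init__(self, ttl_seconds: float = 1800.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds      # assumed lifetime, not taken from the breakdown
        self._clock = clock          # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[str, float]] = {}

    def store(self, view_id: str, response_url: str) -> None:
        """Remember the response_url until the TTL elapses."""
        self._entries[view_id] = (response_url, self._clock() + self._ttl)

    def fetch(self, view_id: str) -> Optional[str]:
        """Return the cached response_url, or None if missing or expired."""
        entry = self._entries.get(view_id)
        if entry is None:
            return None
        response_url, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[view_id]
            return None
        return response_url
```

With something like this, the handler for a modal submission could look up where to post the confirmation message using only the `view_id` carried in the submission payload.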

Sample payloads

`/gitlab incident declare` request

Request from slack:

{
    "token": "CWVqRxRio8bBeeegUxLDb3Mg",
    "team_id": "T03TQEUUA49",
    "team_domain": "yasoniktestworkspace",
    "channel_id": "C03T57PAATY",
    "channel_name": "gitlab-integration",
    "user_id": "U03T8SYJQQ5",
    "user_name": "syasonik_slack",
    "command": "/incident",
    "text": "",
    "api_app_id": "A040FM0LZGF",
    "is_enterprise_install": "false",
    "response_url": "https://hooks.slack.com/commands/T03TQEUUA49/4028601015493/6NUBH1KYAfdisfRFwQ0Lf4ij",
    "trigger_id": "4044195805825.3942504962145.4458391ced435f427a66d2b8a7f62ad8"
}

Response format:

200 OK

display modal request

Request from GitLab:

{
  "ok": true,
  "view": {
    "id": "V040H7HDNB1",
    "team_id": "T03TQEUUA49",
    "type": "modal",
    "blocks": [
      {
        "type": "input",
        "block_id": "incident-title-block",
        "label": { "type": "plain_text", "text": "Title", "emoji": true },
        "optional": false,
        "dispatch_action": false,
        "element": {
          "type": "plain_text_input",
          "action_id": "incident-title",
          "placeholder": {
            "type": "plain_text",
            "text": "Write a title",
            "emoji": true
          },
          "dispatch_action_config": {
            "trigger_actions_on": ["on_enter_pressed"]
          }
        }
      },
      {
        "type": "input",
        "block_id": "incident-project-block",
        "label": { "type": "plain_text", "text": "Project", "emoji": true },
        "optional": false,
        "dispatch_action": false,
        "element": {
          "type": "static_select",
          "action_id": "incident-project",
          "placeholder": {
            "type": "plain_text",
            "text": "Select a project alias",
            "emoji": true
          },
          "initial_option": {
            "text": {
              "type": "plain_text",
              "text": "gitlab-shell",
              "emoji": true
            },
            "value": "gitlab-shell"
          },
          "options": [
            {
              "text": {
                "type": "plain_text",
                "text": "gitlab-shell",
                "emoji": true
              },
              "value": "gitlab-shell"
            },
            {
              "text": {
                "type": "plain_text",
                "text": "gitlab-ui",
                "emoji": true
              },
              "value": "gitlab-ui"
            },
            {
              "text": { "type": "plain_text", "text": "gitlab", "emoji": true },
              "value": "gitlab"
            }
          ]
        }
      },
      {
        "type": "input",
        "block_id": "incident-description-block",
        "label": { "type": "plain_text", "text": "Description", "emoji": true },
        "optional": false,
        "dispatch_action": false,
        "element": {
          "type": "plain_text_input",
          "action_id": "incident-description",
          "placeholder": {
            "type": "plain_text",
            "text": "Write a description...\n\n[Supports GitLab-flavored markdown, including quick actions]",
            "emoji": true
          },
          "multiline": true,
          "dispatch_action_config": {
            "trigger_actions_on": ["on_enter_pressed"]
          }
        }
      }
    ],
    "private_metadata": "",
    "callback_id": "create_incident",
    "state": {
      "values": {
        "incident-project-block": {
          "incident-project": {
            "type": "static_select",
            "selected_option": {
              "text": {
                "type": "plain_text",
                "text": "gitlab-shell",
                "emoji": true
              },
              "value": "gitlab-shell"
            }
          }
        }
      }
    },
    "hash": "1662164854.rD91JUfE",
    "title": { "type": "plain_text", "text": "New incident", "emoji": true },
    "clear_on_close": false,
    "notify_on_close": false,
    "close": { "type": "plain_text", "text": "Cancel", "emoji": true },
    "submit": { "type": "plain_text", "text": "Submit", "emoji": true },
    "previous_view_id": null,
    "root_view_id": "V040H7HDNB1",
    "app_id": "A040FM0LZGF",
    "external_id": "",
    "app_installed_team_id": "T03TQEUUA49",
    "bot_id": "B040PNPQRAS"
  },
  "warning": "missing_charset",
  "response_metadata": { "warnings": ["missing_charset"] }
}
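
Fields such as `id`, `hash`, `root_view_id`, and `bot_id` in the sample above are assigned by Slack, so the body GitLab sends to `views.open` only needs the `trigger_id` from the slash command plus the view layout. The sketch below builds such a body from the block and action IDs shown in the sample. The helper names `plain_text` and `build_views_open_body` are hypothetical, and the layout is trimmed to the essentials.

```python
from typing import Any, Dict, List


def plain_text(text: str) -> Dict[str, Any]:
    """Build a Slack plain_text object, matching the sample's emoji setting."""
    return {"type": "plain_text", "text": text, "emoji": True}


def build_views_open_body(trigger_id: str, project_aliases: List[str]) -> Dict[str, Any]:
    """Assemble a views.open request body for the incident modal (illustrative)."""
    if not project_aliases:
        raise ValueError("at least one project alias is required")
    options = [{"text": plain_text(alias), "value": alias} for alias in project_aliases]
    blocks = [
        {
            "type": "input",
            "block_id": "incident-title-block",
            "label": plain_text("Title"),
            "element": {
                "type": "plain_text_input",
                "action_id": "incident-title",
                "placeholder": plain_text("Write a title"),
            },
        },
        {
            "type": "input",
            "block_id": "incident-project-block",
            "label": plain_text("Project"),
            "element": {
                "type": "static_select",
                "action_id": "incident-project",
                "placeholder": plain_text("Select a project alias"),
                "initial_option": options[0],  # assumption: default to the first alias
                "options": options,
            },
        },
        {
            "type": "input",
            "block_id": "incident-description-block",
            "label": plain_text("Description"),
            "element": {
                "type": "plain_text_input",
                "action_id": "incident-description",
                "multiline": True,
                "placeholder": plain_text("Write a description..."),
            },
        },
    ]
    view = {
        "type": "modal",
        "callback_id": "create_incident",
        "title": plain_text("New incident"),
        "submit": plain_text("Submit"),
        "close": plain_text("Cancel"),
        "blocks": blocks,
    }
    return {"trigger_id": trigger_id, "view": view}
```

Calling `build_views_open_body(trigger_id, ["gitlab-shell", "gitlab-ui", "gitlab"])` with the `trigger_id` from the slash command request would produce a layout equivalent to the one in the sample.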

Response format:

select project request

Request from Slack:

The JSON below is sent under the `payload` key of the request:

{
	"type": "block_actions",
	"user": {
		"id": "U03T8SYJQQ5",
		"username": "syasonik_slack",
		"name": "syasonik_slack",
		"team_id": "T03TQEUUA49"
	},
	"api_app_id": "A02",
	"token": "Shh_its_a_seekrit",
	"container": {
		"type": "message",
		"text": "The contents of the original message where the action originated"
	},
	"trigger_id": "12466734323.1395872398",
	"team": {
		"id": "T03TQEUUA49",
		"domain": "yasoniktestworkspace"
	},
	"enterprise": null,
	"is_enterprise_install": false,
	"state": {
		"values": {
			"incident-title-block": {
				"incident-title": {
					"type": "plain_text_input",
					"value": null
				}
			},
			"incident-project-block": {
				"incident-project": {
					"type": "static_select",
					"selected_option": {
						"text": {
							"type": "plain_text",
							"text": "gitlab-shell",
							"emoji": true
						},
						"value": "gitlab-shell"
					}
				}
			},
			"incident-description-block": {
				"incident-description": {
					"type": "plain_text_input",
					"value": null
				}
			},
			"AJxCB": {
				"multi_conversations_select-action": {
					"type": "conversations_select",
					"selected_conversation": null
				}
			},
			"imUU": {
				"plain_text_input-action": {
					"type": "plain_text_input",
					"value": null
				}
			},
			"sEkW": {
				"multi_static_select-action": {
					"type": "multi_static_select",
					"selected_options": []
				}
			},
			"yiK": {
				"multi_static_select-action": {
					"type": "multi_static_select",
					"selected_options": []
				}
			}
		}
	},
	"response_url": "https://www.postresponsestome.com/T123567/1509734234",
	"actions": [
		{
			"type": "multi_static_select",
			"block_id": "sEkW",
			"action_id": "multi_static_select-action",
			"selected_options": [],
			"placeholder": {
				"type": "plain_text",
				"text": "Select options",
				"emoji": true
			},
			"action_ts": "1662496139.625914"
		}
	]
}
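
Slack delivers interaction payloads as a form-encoded body whose `payload` field holds the JSON above as a string. A minimal sketch of unpacking it and reading the selected project follows. The function names are hypothetical, and the lookup assumes `state` sits at the top level as in this sample; it may be nested differently for other container types.

```python
import json
from typing import Any, Dict, Optional
from urllib.parse import parse_qs


def parse_interaction(raw_body: str) -> Dict[str, Any]:
    """Decode the form-encoded request body and parse the JSON under `payload`."""
    form = parse_qs(raw_body)
    return json.loads(form["payload"][0])


def selected_project(payload: Dict[str, Any]) -> Optional[str]:
    """Return the chosen project alias, or None if nothing is selected."""
    values = payload.get("state", {}).get("values", {})
    element = values.get("incident-project-block", {}).get("incident-project", {})
    option = element.get("selected_option") or {}
    return option.get("value")
```

For the sample request above, `selected_project(parse_interaction(body))` would yield `gitlab-shell`, which is the value needed to look up the project's incident template.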

Response format:

200 OK

set description to incident template request

Request from GitLab:

Response format:

open assignee/label dropdown request

Request from Slack:

Response format:

search for assignee/label dropdown request

Request from Slack:

Response format:
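
One possible shape for this response is sketched below. It assumes the assignee/label dropdowns use Slack's external-data select menus, which send the typed query and expect an `options` array in return; the breakdown does not name the element type, so treat that as an assumption. The function name, the substring matching, and the cap of 100 options are illustrative choices.

```python
from typing import Any, Dict, List


def build_options_response(candidates: List[str], query: str,
                           limit: int = 100) -> Dict[str, Any]:
    """Filter candidate names by the typed query and wrap them as menu options."""
    needle = query.strip().lower()
    matches = [name for name in candidates if needle in name.lower()]
    return {
        "options": [
            {"text": {"type": "plain_text", "text": name}, "value": name}
            for name in matches[:limit]  # keep the menu to a bounded size
        ]
    }
```

An empty query returns every candidate (up to the limit), which would also cover the "open dropdown" case where the user has not typed anything yet.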

submit modal request

Request from Slack:

Response format:

create channel request

Request from GitLab:

Response format:

display confirmation/error message request

Request from GitLab:

Response format:

Assumptions & Limitations

Incident creation has to be quick, so we don't want to use runners/pipelines

@dawsmith mentioned some challenges with the Woodhouse implementation that will be important to consider when moving forward:

Chatops (like in #production) relies on CI runners to execute a command; current chatops interactions rely on CI runners on ops.gitlab.net. Woodhouse instead uses GCP Cloud Functions (similar to AWS Lambda) to execute its tasks, so it differs from Chatops. When starting an incident, we don't want to wait a long time for runners to spin up and execute code, or to have every command fail if the runners fail.

We won't create slack shortcuts for now -> just slash commands

We have a number of options for triggering an incident creation modal:

  1. Via slash command
  2. Via shortcut / message menu / app's home tab

The Slack shortcut and the other entry points would allow incident creation from anywhere in Slack without a slash command. However, if we wanted to respond to the modal submission with a confirmation message, we'd need to include a channel selection dropdown in the modal to specify where to post that message.

This is absolutely something we can look at doing in future, but doesn't make sense for a first iteration, since the payload formats also vary somewhat between slash commands and shortcuts+.

We won't create a /gitlab <project-alias> incident new Slack slash command

We want to make incident creation as easy as possible for users, which means starting with one specific way to create an incident. Even though /gitlab <project-alias> incident new matches the issue slash command format, we wouldn't want to keep it after implementing the modal. So we're going to jump straight to the modal.

We won't worry about self-managed for the time being

Some of the functionality we ship may be available for self-managed instances, but ~"group::integrations" will be managing the transition of existing slash command functionality for self-managed instances to the new Slack app format.

Anything which relies on the Slack app itself will not initially be available for self-managed instances.

We will coordinate with ~"group::integrations" to manage Slack app re-approval process

Changes to Slack bot permissions require Slack apps to be re-submitted for review. ~"group::integrations" will provide support if/when we need to expand the bot's permissions. If complete details are available for our internal submission process, we should follow those instructions.

We will implement incident notifications alongside ~"group::integrations" notifications update

As ~"group::integrations" migrates the existing Slack notifications integration into the Slack app integration, there will be a permissions update to include notifications. Once they've finalized the code organization/structure for the new version of notifications, we can add our incident-only trigger.

We will not implement the incident-only notification using the existing notifications integration.

We won't make ~"group::integrations" wait on our features to roll out theirs

We'll need to add new bot permissions to the Slack app in order to create a new Slack channel for an incident. Rather than trying to include this in ~"group::integrations"'s notifications rollout, we'll assume that timeline would be too tight. We'll focus on the incident creation/update flows initially.

Permissions & Security

  • Access should match the user permissions -> If user can create incidents, they can create incidents via Slack.
  • All slack slash commands should be visible via /gitlab help, regardless of user/permissions.
  • Permissions-based errors should be handled when a user attempts to perform an action
  • Details on slack permissions, review process & timeline
  • For the moment, we don't want to switch to the Slack-recommended source verification methodology, preferring to keep with our existing usage of token comparison
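
To make the token comparison point concrete, here is a minimal sketch of checking the `token` field from an incoming Slack payload against the value configured for the integration. It is not GitLab's existing code: the function name is hypothetical, and the constant-time comparison via `hmac.compare_digest` is a suggested safeguard rather than something the breakdown specifies.

```python
import hmac
from typing import Optional


def token_matches(payload_token: Optional[str], expected_token: Optional[str]) -> bool:
    """Compare the request's verification token with the configured one."""
    if not payload_token or not expected_token:
        return False  # a missing token on either side never authenticates
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(
        payload_token.encode("utf-8"), expected_token.encode("utf-8")
    )
```

A request handler would call this before doing any other work and reject the request when it returns `False`.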

Frontend

We won't have any explicit frontend work, unless we decide to relocate the incident template setting.

Backend

Database

No new database work is currently expected.

API

See:

Scheduled/Async jobs

No new async jobs are currently expected.

Rollout

Incident-only notifications will roll out in ~ %15.7 alongside the Slack app update led by the ~"group::integrations" team.

Manual testing for all slack-related features should include regression testing for existing slash commands & error cases (permissions, save failures, missing aliases, invalid options, etc).

Feature flags

Incident creation:

  • Let's use :incident_creation_slash_command feature flag, rolled out by user.
  • Because project selection is part of the modal, we can't roll out this feature flag by project without running a bunch of SQL queries. To do that, we'd effectively need to run the new code and then return a permissions error, which somewhat defeats the point.
  • Rolling out by user allows us to individually enable the slash command early for testing/incremental development without impacting the existing logic.
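
The reasoning above can be sketched as a gate that only needs the user's identity, so it runs before any project lookup. Everything in this example is hypothetical (the `UserScopedFlag` class, the handler, and its return values); it does not depict GitLab's feature flag API.

```python
from typing import Set


class UserScopedFlag:
    """Hypothetical feature flag that is enabled per user ID."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._enabled_users: Set[int] = set()

    def enable_for(self, user_id: int) -> None:
        self._enabled_users.add(user_id)

    def enabled(self, user_id: int) -> bool:
        return user_id in self._enabled_users


def handle_incident_declare(flag: UserScopedFlag, user_id: int) -> str:
    """Decide what to do with the slash command without touching any project."""
    if not flag.enabled(user_id):
        return "unknown_command"  # fall through to the existing behaviour
    return "open_modal"           # project selection happens later, in the modal
```

Because the check depends only on the user, it can be made as soon as the slash command arrives, and users without the flag never reach the new code path.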

Remaining tasks:

  • We should not use a feature flag for individually valuable tasks, like /link.

Metrics

We could track:

  • unique users using any incident slash command
  • unique users using each incident slash command
  • unique projects using any incident slash commands
  • count of new incidents created via slack
  • count of new incidents (overall)
  • count of new timeline events created via slack
  • count of new timeline events (overall)
  • count of new incident comments created via slack
  • count of new incident comments (overall)
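
As an illustration of the first two metrics, the sketch below derives unique-user counts from a list of `(user_id, command)` usage events. The event shape and function names are assumptions for the example and do not reflect how GitLab records usage data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Event = Tuple[str, str]  # (user_id, command), a hypothetical event shape


def unique_users_any_command(events: Iterable[Event]) -> int:
    """Count distinct users who ran at least one incident slash command."""
    return len({user_id for user_id, _ in events})


def unique_users_by_command(events: Iterable[Event]) -> Dict[str, int]:
    """Count distinct users separately for each incident slash command."""
    users: Dict[str, Set[str]] = defaultdict(set)
    for user_id, command in events:
        users[command].add(user_id)
    return {command: len(ids) for command, ids in users.items()}
```

The "via slack" versus "overall" pairs in the list would then be simple ratios of a Slack-sourced count to the corresponding total.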

Documentation

Documentation should primarily be added to https://docs.gitlab.com/ee/integration/slash_commands.html, and potentially https://docs.gitlab.com/ee/user/project/integrations/gitlab_slack_application.html.

Many of the features we want to implement in Slack are already GitLab quick actions. If we improve documentation on how to utilize quick actions for incident management via slack, we could provide value without requiring a direct update to the slack application itself (and thus no required reinstall).

Issue breakdown

Relocated to child epics in gitlab-org&8545.

Definition of done

The following are required prior to starting implementation:

  • Draft of breakdown issue was posted to #g_respond with a request for feedback
  • At least one engineer provided feedback
  • Scope, timeline/dependencies, and technical limitations are fully defined
  • Detailed breakdown issue was posted to #g_respond with a last call for feedback
  • All needed issues are created & linked to the appropriate epic
  • All open threads are resolved or determined non-blocking

Open questions

  1. Quick action docs technically say that some incident-specific quick actions are available for issues. Do we need to remedy that?
Edited by Sarah Yasonik