Web Application Firewall for Kubernetes Cluster MVC

Problem to solve

Organizations today deploy web applications that are exposed to a wide range of network-based attacks. Many of these occur at the HTTP level, where traditional firewalls are not effective. We want to identify malicious HTTP traffic based on the contents of the HTTP messages before it reaches the rest of the web app, so that we can either log or drop it.

Intended users

Sidney (Systems Administrator)
Sam (Security Analyst)
Devon (DevOps Engineer)

Details

At a high level, this issue is about installing the ModSecurity WAF into the Nginx Ingress controller of a Kubernetes cluster. Configure the WAF in report-only mode to demonstrate that it is working correctly and to let us start learning more.

Do not block traffic automatically, as our Security Paradigm is to not block unless explicitly asked to do otherwise. Enabling blocking is out of scope for this MVC and will be handled in future issues, since there are implications we need to work through with respect to UX, how to inform users about blocked traffic, and how to give users control over false positives.
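
As a rough illustration of report-only mode, the sketch below builds a ConfigMap for the Nginx Ingress controller. The keys `enable-modsecurity`, `enable-owasp-modsecurity-crs`, and `modsecurity-snippet` are ingress-nginx ConfigMap options and `SecRuleEngine DetectionOnly` is the ModSecurity directive for logging without blocking. The ConfigMap name and the helper function are assumptions made for this example, not the actual integration code.

```python
import json


def detection_only_configmap(name="nginx-ingress-controller",
                             namespace="gitlab-managed-apps"):
    """Build a ConfigMap manifest that enables ModSecurity in report-only mode.

    The default name is an assumption; use the name of the ConfigMap that
    your Ingress controller is actually configured to read.
    """
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        "data": {
            # Load the ModSecurity module in the controller.
            "enable-modsecurity": "true",
            # Load the OWASP Core Rule Set bundled with the controller.
            "enable-owasp-modsecurity-crs": "true",
            # Evaluate and log rule matches, but never block a request.
            "modsecurity-snippet": "SecRuleEngine DetectionOnly\n",
        },
    }


if __name__ == "__main__":
    # JSON is valid YAML, so the output can be saved and applied as a manifest.
    print(json.dumps(detection_only_configmap(), indent=2))
```

Switching to blocking later would mean changing `DetectionOnly` to `On`, which is deliberately left out of this MVC.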

Proposal

Allow users to install ModSecurity into the Ingress controller of a Kubernetes cluster in detection-only mode.
- Allow this for both existing clusters and clusters created with our GKE-integration.
- I propose that this is another "Application" under the Kubernetes cluster configuration screen. Would love input on this point if there is a better spot though.
Pre-configure the WAF with the default OWASP rules.
- Always configure with this rule set
- Do not expose this as a configuration option to the user for them to provide their own rules. (That will be done in a future issue).
Allow users to uninstall the WAF, once installed, if desired.
- This action should be done in the same location as where the user chose to install the WAF
- Do this for both existing clusters and clusters created with our GKE-integration.
Expose the logs produced by the WAF to the user
- Get input from the team on the best place and way to do this. Options include looking at the logs of a pod, creating a dedicated screen/tab somewhere (cluster management or environment screen perhaps), downloading a log file, or something else.
Implement adoption and usage metrics
- Report back that installation of WAF in the project was performed
- Report back that removal of WAF in the project was performed
- Leverage the existing product reporting mechanisms
  - Is anything more specific needed here to differentiate between usage ping and GitLab.com?

Documentation

Documentation should be created and/or updated to cover:

The problem a WAF solves
What a WAF is at a high level
What the GitLab WAF is pre-configured to detect and report on
How the GitLab WAF fits inside the customer's application architecture
How to enable the GitLab WAF inside of an application
How to configure the GitLab WAF
How to remove the GitLab WAF
How to consume the results produced by the GitLab WAF

Testing

There are several testing aspects that should be focused on for this MVC that roughly map to the different parts of the user experience:

Installation/configuration of the WAF
- Users should be able to successfully install and configure the WAF if they are using the reference application architecture and deployment model.
- GitLab should fail gracefully if they have an incompatible application architecture or deployment model.
Use of the application with the WAF installed
- The application should still continue to function properly for legitimate traffic.
- The WAF should not block malicious traffic unless it has been configured to specifically do so.
- The various interfaces to consume the WAF results should be correctly displaying events as they occur.
Removal of the WAF
Use of the application after the WAF has been uninstalled
- We need to confirm that uninstallation of the WAF has no adverse ongoing effects on the application. It should behave identically to before it ever had a WAF installed.

[ ] @twoodham Schedule deep dive on requirements

What does success look like, and how can we measure that?

Percentage of newly created clusters with Ingress installed that still have the WAF enabled within 30 days of our release. Target => 75%

Since clusters will have WAF added when Ingress is installed, we should expect a high percentage of clusters to continue using the WAF if it is successful. If we see many users disabling or removing the WAF, that is an indicator we need to investigate.

Percentage of all GKE-integration clusters that have the WAF within 30 days of our release. Target => 10%

Percentage of the above users who continue using the WAF in their deployed app for at least 30 days. Target => 75%

This metric is important to ensure that customers are getting enough value out of the GitLab WAF to continue using it beyond an initial exploration period.
Initially measure 60 days after release. Periodically re-measure.

Ingress controllers with ModSecurity enabled.
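
The percentage targets above could be computed along the following lines. This is a minimal sketch: the `ClusterRecord` fields and function names are hypothetical and do not reflect the existing reporting mechanisms or their data model.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ClusterRecord:
    """Hypothetical per-cluster record; field names are illustrative only."""
    waf_installed: bool
    waf_enabled_after_30_days: bool


def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage, or 0.0 when there is nothing to measure."""
    return 0.0 if whole == 0 else 100.0 * part / whole


def adoption_rate(clusters: Iterable[ClusterRecord]) -> float:
    """Percentage of all clusters that have the WAF installed."""
    clusters = list(clusters)
    return percentage(sum(c.waf_installed for c in clusters), len(clusters))


def retention_rate(clusters: Iterable[ClusterRecord]) -> float:
    """Percentage of WAF installs that are still enabled after 30 days."""
    installed = [c for c in clusters if c.waf_installed]
    retained = sum(c.waf_enabled_after_30_days for c in installed)
    return percentage(retained, len(installed))


def meets_target(rate: float, target: float) -> bool:
    """True when the measured rate reaches the target, both in percent."""
    return rate >= target
```

For example, `meets_target(retention_rate(records), 75.0)` would check the 75% retention target, and `meets_target(adoption_rate(records), 10.0)` the 10% adoption target.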

Links / references

ModSecurity WAF
Product discovery at https://gitlab.com/gitlab-org/gitlab-ee/issues/9520.
The Nginx Ingress controller supports ModSecurity and allows enabling it via annotations:
- https://kubernetes.github.io/ingress-nginx/user-guide/third-party-addons/modsecurity/
- https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#modsecurity
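
For reference, the annotation-based approach described in those pages can be sketched as follows. The three annotation names come from the ingress-nginx annotations documentation. The helper functions and the example Ingress name are assumptions for illustration, and this per-Ingress approach is not necessarily how the integration configures the WAF.

```python
import copy

ANNOTATION_PREFIX = "nginx.ingress.kubernetes.io"


def modsecurity_annotations() -> dict:
    """Annotations that enable ModSecurity in detection-only mode for one Ingress."""
    return {
        f"{ANNOTATION_PREFIX}/enable-modsecurity": "true",
        f"{ANNOTATION_PREFIX}/enable-owasp-core-rules": "true",
        # Log rule matches without blocking the request.
        f"{ANNOTATION_PREFIX}/modsecurity-snippet": "SecRuleEngine DetectionOnly\n",
    }


def with_waf(ingress: dict) -> dict:
    """Return a copy of an Ingress manifest with the WAF annotations merged in."""
    patched = copy.deepcopy(ingress)
    metadata = patched.setdefault("metadata", {})
    metadata.setdefault("annotations", {}).update(modsecurity_annotations())
    return patched


if __name__ == "__main__":
    # "example-app" is a placeholder name used only for this sketch.
    ingress = {"kind": "Ingress", "metadata": {"name": "example-app"}}
    print(with_waf(ingress)["metadata"]["annotations"])
```

Because `with_waf` works on a copy, uninstalling amounts to reapplying the original manifest without these annotations.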

Development Log

Status

gitlab-ce~24926493 work to add modsecurity to ingress integration ~~https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/15774~~ https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/32905
~~Consider upgrading nginx-ingress to get latest rules https://gitlab.com/gitlab-org/gitlab-ce/issues/61355~~ DEFERRED
~~Expose gitlab-managed-apps within Pod Logs~~ DEFERRED

Decisions

Ship behind a feature flag so that the WAF can be disabled if a user considers its performance impact significant
Initially ship as a configuration of existing Ingress managed application, not a separate app
Ideally, logs will be fetchable via some UI like pod logs, but currently they require manually tailing the log from the Ingress controller pod
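
Until a UI exists, a small filter can make the manually tailed log easier to read. The sketch below assumes that WAF entries contain the marker `ModSecurity:` and carry `[id "..."]` and `[msg "..."]` fields, as ModSecurity messages commonly do. The exact log format depends on the ModSecurity and controller versions, so treat the patterns as assumptions to verify against real output.

```python
import re
import sys
from typing import Dict, Iterable, Iterator, Optional

# Assumed field layout: [id "942100"] [msg "SQL Injection Attack Detected"]
RULE_ID_RE = re.compile(r'\[id "(\d+)"\]')
MESSAGE_RE = re.compile(r'\[msg "([^"]*)"\]')


def modsecurity_events(lines: Iterable[str]) -> Iterator[Dict[str, Optional[str]]]:
    """Yield one record per log line that looks like a ModSecurity entry."""
    for line in lines:
        if "ModSecurity:" not in line:
            continue  # Ordinary access or error log line; skip it.
        rule_id = RULE_ID_RE.search(line)
        message = MESSAGE_RE.search(line)
        yield {
            "rule_id": rule_id.group(1) if rule_id else None,
            "message": message.group(1) if message else None,
            "raw": line.rstrip("\n"),
        }


if __name__ == "__main__":
    # Pipe the tailed log of the Ingress controller pod into this script.
    for event in modsecurity_events(sys.stdin):
        print(event["rule_id"], event["message"])
```

Lines without a rule ID or message are still reported, with `None` in the missing field, so nothing flagged by the WAF is silently dropped.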

Edited Sep 19, 2019 by Lucas Charles