GitGuardian: Generate a descriptive output about findings

Overview

As part of GitGuardian pre-receive secrets detection (&11494 - closed), we implemented a GitGuardian integration that rejects a push if any of its commits violates a policy check. The rejection message currently looks something like this:

Secrets detection policy violated at .env for Basic Auth String 'jen_barber'

However, the JSON returned by the API also includes line numbers and indexes that pinpoint the place in the file that actually violated the policy:

JSON content

{
    "policies": [
        "Secrets detection"
    ],
    "policy_break_count": 1,
    "policy_breaks": [
        {
            "incident_url": "",
            "known_secret": false,
            "matches": [
                {
                    "index_end": 48,
                    "index_start": 39,
                    "line_end": 1,
                    "line_start": 1,
                    "match": "jen_barber",
                    "type": "username"
                },
                {
                    "index_end": 74,
                    "index_start": 50,
                    "line_end": 1,
                    "line_start": 1,
                    "match": "correcthorsebatterystaple",
                    "type": "password"
                },
                {
                    "index_end": 95,
                    "index_start": 76,
                    "line_end": 1,
                    "line_start": 1,
                    "match": "cake.gitguardian.com",
                    "type": "host"
                }
            ],
            "policy": "Secrets detection",
            "type": "Basic Auth String",
            "validity": "no_checker"
        }
    ]
}
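
To make the positional fields concrete, here is a minimal Python sketch that walks a trimmed copy of the response above and prints where each match sits. The helper name `iter_matches` is hypothetical and not part of any existing code. In this sample, `index_end - index_start + 1` equals the length of `match` for all three entries, which suggests `index_end` is inclusive; that is an observation about this sample, not a documented guarantee.

```python
import json

# Trimmed copy of the response shown above (only the fields used here).
response = json.loads("""
{
  "policy_breaks": [
    {
      "type": "Basic Auth String",
      "matches": [
        {"index_start": 39, "index_end": 48, "line_start": 1,
         "match": "jen_barber", "type": "username"},
        {"index_start": 50, "index_end": 74, "line_start": 1,
         "match": "correcthorsebatterystaple", "type": "password"},
        {"index_start": 76, "index_end": 95, "line_start": 1,
         "match": "cake.gitguardian.com", "type": "host"}
      ]
    }
  ]
}
""")


def iter_matches(response):
    """Yield (policy break type, match dict) for every match in the response."""
    for policy_break in response["policy_breaks"]:
        for match in policy_break["matches"]:
            yield policy_break["type"], match


for break_type, match in iter_matches(response):
    # Span computed with an inclusive end index (assumption based on the sample).
    span = match["index_end"] - match["index_start"] + 1
    print(
        f"{break_type}: {match['type']} at line {match['line_start']}, "
        f"indexes {match['index_start']}-{match['index_end']} "
        f"(span {span}, match length {len(match['match'])})"
    )
```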

Proposal

When the ggshield tool is run, its output is richer:

ggshield secret scan pre-commit

Output

Scanned example.env
Scanning... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 1 / 1

secrets-engine-version: 2.108.0

> commit://staged/example.env: 1 incident detected

>> Secret detected: Basic Auth String
   Validity: No Checker
   Occurrences: 1
   Known by GitGuardian dashboard: NO
   Incident URL: N/A
   Secret SHA: 5e9107aedc48b14af2749703ed3f83c2e1e6aca82ed86af980a2b925708c2da6

    | @@ -0,0 +1,4 @
  1 | import urllib.request
  2 | url = 'http://je******er:corre***************taple@cake************.com/isreal.json'
                    |_username_|
  2 | url = 'http://je******er:corre***************taple@cake************.com/isreal.json'
                               |________password_______|
  2 | url = 'http://je******er:corre***************taple@cake************.com/isreal.json'
                                                         |_______host_______|
  3 | response = urllib.request.urlopen(url)
  4 | consume(response.read())

> How to remediate

  Since the secret was detected before the commit was made:
  1. replace the secret with its reference (e.g. environment variable).
  2. commit again.

> [To apply with caution] If you want to bypass ggshield (false positive or other reason), run:
  - if you use the pre-commit framework:

     SKIP=ggshield git commit -m "<your message>"

  - otherwise (warning: the following command bypasses all pre-commit hooks):

     git commit -m "<your message>" --no-verify

We can also use the line/index start/end values to get the content from the scanned blobs and enrich the output. It doesn't necessarily need to be as elaborate as the |________password_______| markers in the ggshield output.

For the How to remediate section, we can describe the flag introduced in Allow bypassing GitGuardian integration (!147367 - merged) or link to the documentation (which is yet to be created, see !147367 (comment 1821554906)).
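
As a rough illustration of the enrichment idea (pulling the offending line out of the scanned blob and pointing at the match), here is a minimal Python sketch. It is not the integration's implementation. The function names `mask` and `describe_match` are hypothetical. It assumes `index_start`/`index_end` are 0-based character offsets into the whole blob with an inclusive end, and that a match does not span multiple lines. The masking rule (keep about a fifth of the secret at each end) was picked only to resemble the ggshield output above.

```python
def mask(secret: str) -> str:
    """Hide the middle of a secret, keeping roughly a fifth at each end."""
    keep = len(secret) // 5
    hidden = len(secret) - 2 * keep
    return secret[:keep] + "*" * hidden + secret[len(secret) - keep:]


def describe_match(content: str, match: dict) -> str:
    """Return the masked source line plus a marker line under the match.

    Assumes 0-based, end-inclusive offsets into `content` (not confirmed
    by the issue) and a match contained in a single line.
    """
    start = match["index_start"]
    end = match["index_end"] + 1  # convert the inclusive end to exclusive

    # Locate the boundaries of the line that contains the match.
    line_begin = content.rfind("\n", 0, start) + 1
    line_end = content.find("\n", start)
    if line_end == -1:
        line_end = len(content)
    end = min(end, line_end)  # clamp so the marker never leaves the line

    line = content[line_begin:line_end]
    column = start - line_begin
    width = end - start

    masked = line[:column] + mask(content[start:end]) + line[column + width:]
    gutter = f"{match['line_start']:>3} | "
    marker = " " * (len(gutter) + column) + "^" * width + " " + match["type"]
    return gutter + masked + "\n" + marker


# Hypothetical blob, padded so that the offsets from the JSON above line up.
blob = "x" * 39 + "jen_barber:correcthorsebatterystaple@cake.gitguardian.com\n"
username = {
    "index_start": 39,
    "index_end": 48,
    "line_start": 1,
    "line_end": 1,
    "match": "jen_barber",
    "type": "username",
}
print(describe_match(blob, username))
```

A single `^^^` marker line per match keeps the rejection message short, which matters for pre-receive output shown in a terminal. Whether the API's offsets count characters or bytes would need to be confirmed before relying on this for non-ASCII content.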