Stop overriding spamcheck verdicts !=ALLOW to allow rescuing via reCAPTCHA. (!71496) · Merge requests · GitLab.org / GitLab

What does this MR do and why?

This MR introduces a minimal code change but a considerable paradigm shift in how to handle Spamcheck's verdicts.

As described in https://gitlab.com/gitlab-com/gl-security/engineering-and-research/automation-team/spam/spamcheck/-/issues/124, some spammers do indeed solve reCAPTCHAs (https://gitlab.com/gitlab-com/gl-security/engineering-and-research/automation-team/spam/spamcheck/-/issues/124#note_624256047).

This MR no longer allows for issue submitters to solve a reCAPTCHA in all cases three where Spamcheck's verdict is !=ALLOW, only when verdict is =CONDITIONAL_ALLOW.

By doing this, Spamcheck becomes the SSOT for whether or not to allow an issue submission or not, its verdicts shall not be overwritten, modified, etc.

Before the MR

    def spamcheck_verdict
      return unless Gitlab::CurrentSettings.spam_check_endpoint_enabled

      begin
        result, attribs, _error = spamcheck_client.issue_spam?(spam_issue: target, user: user, context: context)
        return [nil, attribs] unless result

        # @TODO log if error is not nil https://gitlab.com/gitlab-org/gitlab/-/issues/329545

        return [result, attribs] if result == NOOP || attribs["monitorMode"] == "true"

        # Duplicate logic with Akismet logic in #akismet_verdict
        if Gitlab::Recaptcha.enabled? && result != ALLOW
          [CONDITIONAL_ALLOW, attribs]
        else
          [result, attribs]
        end
      rescue StandardError => e
        Gitlab::ErrorTracking.log_exception(e)

        # Default to ALLOW if any errors occur
        [ALLOW, attribs, true]
      end
    end

non-admin, non-member user submits an issue to a public project
spam_verdict_service.rb requests verdict from Spamcheck
Spamcheck's evaluation leads to a spam value that passes the ALLOW and the CONDITIONAL_ALLOW threshold
Spamcheck returns DISALLOW or BLOCK depending of the final spam value
spam_verdict_service.rb overrides the non-ALLOW result to CONDITIONAL_ALLOW
user is presented with reCAPTCHA
reCAPTCHA is solved
submission of issue is successful

After the MR

    def spamcheck_verdict
      return unless Gitlab::CurrentSettings.spam_check_endpoint_enabled

      begin
        result, attribs, _error = spamcheck_client.issue_spam?(spam_issue: target, user: user, context: context)
        # @TODO log if error is not nil https://gitlab.com/gitlab-org/gitlab/-/issues/329545
        
        return [nil, attribs] unless result
        [result, attribs]  
      rescue StandardError => e
        Gitlab::ErrorTracking.log_exception(e)

        # Default to ALLOW if any errors occur
        [ALLOW, attribs, true]
      end
    end

non-admin, non-member user submits an issue to a public project
spam_verdict_service.rb requests verdict from Spamcheck
Spamcheck's evaluation leads to a spam value that passes the ALLOW and the CONDITIONAL_ALLOW threshold
Spamcheck returns DISALLOW or BLOCK depending of the final spam value
spam_verdict_service.rb doesn't override the non-ALLOW result to be CONDITIONAL_ALLOW
user is not presented with reCAPTCHA
reCAPTCHA is not solved since not presented
submission of issue rejected

How to set up and validate locally

reCAPTCHA

Add a site and get a reCAPTCHA key and secret over at https://www.google.com/recaptcha/admin/.
Add localhost, gdk.test and/or gitlab.local as valid domains for that site

Spamcheck

clone https://gitlab.com/gitlab-com/gl-security/engineering-and-research/automation-team/spam/spamcheck/
docker-compose up --build

GitLab (GDK)

enable reCAPTCHA
enable Spamcheck

select a public project
login as/impersonate a non-admin user that is not a member of said project
submit a spam-like issue
notice Spamcheck's verdict (DISALLOW/BLOCK) as per the logs, GitLab's reception of said verdict in gitlab/log/application_json.log and verify that no reCAPTCHA is presented to the user

spamcheck_1  | {"level":"info","metric":"spamcheck_inspector_verdicts","msg":"Inspector verdict received","time":"2021-09-30T17:16:55Z","verdict":1}
spamcheck_1  | {"correlation_id":"01FGVVB0YQNK9DJ53WKPW6NPGB","email_allowlisted":false,"level":"info","metric":"spamcheck_verdicts","monitor_mode":"false","msg":"Verdict calculated","project_path":"h5bp/html5-boilerplate","time":"2021-09-30T17:16:55Z","verdict":"BLOCK"}
spamcheck_1  | {"error":null,"failed":false,"level":"info","method":"CheckForSpamIssue","metric":"spamcheck_api_calls","msg":"","time":"2021-09-30T17:16:55Z"}

{"severity":"INFO","time":"2021-09-30T17:16:55.489Z","correlation_id":"01FGVVB0YQNK9DJ53WKPW6NPGB","class":"Spam::SpamVerdictService","akismet_verdict":"allow","spam_check_verdict":"block","extra_attributes":{"monitorMode":"false"},"spam_check_rtt":0.16151539899874479,"final_verdict":"block","username":"reported_user_22","user_id":44,"target_type":"Issue","project_id":8}

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

I have evaluated the MR acceptance checklist for this MR.

cc @cwoolley-gitlab @cablett @stanhu @jwanjohi @eurie

Stop overriding spamcheck verdicts !=ALLOW to allow rescuing via reCAPTCHA.