GitLab.org / omnibus-gitlab / Issues / #6259 (Closed)
Issue created Jul 12, 2021 by Ethan Urie (@eurie), Developer

Add Spamcheck anti-spam engine to omnibus-gitlab packages

Details

Request to include GitLab's Spamcheck anti-spam engine in omnibus-gitlab installations.

The request also covers the spam classifier, an obfuscated Python script that ships with a TensorFlow model for classification.

  • URL:

    • Spam Classifier: https://gitlab.com/gitlab-com/gl-security/engineering-and-research/automation-team/ml-spam-detection/spam-classifier
    • Spamcheck: https://gitlab.com/gitlab-org/spamcheck
  • License:

    • Spam Classifier: Proprietary license, obfuscated code - https://gitlab.com/gitlab-com/gl-security/engineering-and-research/automation-team/ml-spam-detection/spam-classifier/-/blob/main/LICENSE
    • Spamcheck: MIT license - https://gitlab.com/gitlab-org/spamcheck
  • How does it integrate into GitLab (service, built-in feature)?

    • Both spam-classifier and Spamcheck need to be running. Spam-classifier listens on a socket. Spamcheck listens on two TCP endpoints (gRPC and REST) so that GitLab Rails can communicate with it.
    • When a new issue is created, GitLab Rails asks Spamcheck over gRPC for a verdict on whether the issue is spam or ham.
    • Spamcheck calls spam-classifier to classify the incoming issue and returns one of these verdicts to GitLab Rails: ALLOW, BLOCK, CONDITIONAL_ALLOW, DISALLOW, NOOP.
  • Does it need to run under a specific user or have specific permissions?: No

  • What are the concerns for running it behind a firewall, proxy, etc?: It is self-contained, but requires three ports (gRPC, REST, and metrics endpoints) to be available.

  • Does it have any additional compilation or runtime requirements beyond what is already used within omnibus?

    • Spamcheck requires libtensorflow_lite for compilation
    • Spam-classifier requires Python 3.9 runtime for execution
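
The five verdicts named in the integration answer above can be handled with a simple dispatch. The following is a minimal sketch only: the verdict names come from this issue, but the function name `verdict_outcome` and the grouping into proceed / challenge / reject are assumptions made for illustration, not a description of what GitLab Rails does with each verdict.

```shell
#!/bin/sh
# verdict_outcome VERDICT
# Prints a coarse outcome for a Spamcheck verdict.
# ASSUMPTION: the proceed/challenge/reject grouping is illustrative only.
verdict_outcome() {
    case $1 in
        ALLOW|NOOP)        echo proceed ;;
        CONDITIONAL_ALLOW) echo challenge ;;
        BLOCK|DISALLOW)    echo reject ;;
        *)
            # Anything outside the five listed verdicts is treated as an error.
            echo "unknown verdict: $1" >&2
            return 1
            ;;
    esac
}
```

For example, `verdict_outcome BLOCK` prints `reject`, and an unrecognised string returns a non-zero status so a calling script can stop.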

Running (on the same node where GitLab runs)

Spam-classifier

  1. Download and extract the tarball from https://glsec-spamcheck-ml-artifacts.storage.googleapis.com/spam-classifier/0.2.0/linux.tar.gz

  2. Run the following command

    python3 dist/preprocess.py
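
The two spam-classifier steps above could be scripted roughly as follows. This is a sketch under stated assumptions: the function names and destination paths are hypothetical, and it assumes the tarball unpacks a `dist/` directory at its top level (inferred from the `python3 dist/preprocess.py` command, not verified against the archive).

```shell
#!/bin/sh
# Tarball URL from step 1 above.
CLASSIFIER_URL="https://glsec-spamcheck-ml-artifacts.storage.googleapis.com/spam-classifier/0.2.0/linux.tar.gz"

# download_classifier DEST_FILE
# Fetches the tarball and fails on HTTP errors instead of saving an error page.
download_classifier() {
    curl --fail --silent --show-error --location --output "$1" "$CLASSIFIER_URL"
}

# extract_classifier TARBALL DEST_DIR
# Unpacks the tarball and confirms the script that step 2 runs is present.
# ASSUMPTION: dist/preprocess.py sits at the top level of the archive.
extract_classifier() {
    mkdir -p "$2" || return 1
    tar -xzf "$1" -C "$2" || return 1
    [ -f "$2/dist/preprocess.py" ]
}

# Example usage (hypothetical paths):
#   download_classifier /tmp/linux.tar.gz
#   extract_classifier /tmp/linux.tar.gz "$HOME/spam-classifier"
#   cd "$HOME/spam-classifier" && python3 dist/preprocess.py
```

Checking for `dist/preprocess.py` right after extraction gives an early failure if the archive layout differs from what the run command expects.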

Spamcheck (on a different terminal)

  1. Clone spamcheck repo and change to the target directory

  2. Ensure the dependencies are present

    1. Golang runtime
    2. make
    3. libtensorflow_lite - https://www.tensorflow.org/lite/guide/build_cmake
  3. Set GOPATH

    export GOPATH=${HOME}/go
  4. Update PATH to include Golang binary path

    export PATH="$PATH:$(go env GOPATH)/bin"
  5. Build the binary

    make build
  6. Copy example config

    cp config/config.toml.example config/config.toml
  7. Change modelPath in the config file to point to the model.tflite file from the extracted spam-classifier tarball.

  8. Run Spamcheck

    make run
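
Step 7 above can be done by hand in an editor, or scripted. The sketch below assumes the config file contains a line of the form `modelPath = "..."`; the exact layout of `config/config.toml` is not shown in this issue, so treat the pattern, the function name `set_model_path`, and the example paths as assumptions.

```shell
#!/bin/sh
# set_model_path CONFIG_FILE MODEL_FILE
# Rewrites the modelPath value in CONFIG_FILE to point at MODEL_FILE.
# ASSUMPTION: the config has a line like `modelPath = "..."`.
# ASSUMPTION: MODEL_FILE contains no `|` or `&` characters (sed delimiters).
set_model_path() {
    config=$1
    model=$2
    [ -f "$config" ] || { echo "config not found: $config" >&2; return 1; }
    [ -f "$model" ] || { echo "model not found: $model" >&2; return 1; }
    _smp_tmp=$(mktemp) || return 1
    # Keep the original indentation and key, replace only the value.
    sed "s|^\([[:space:]]*modelPath[[:space:]]*=\).*|\1 \"$model\"|" \
        "$config" > "$_smp_tmp" || { rm -f "$_smp_tmp"; return 1; }
    # Write back through cat so the config file keeps its permissions.
    cat "$_smp_tmp" > "$config"
    rm -f "$_smp_tmp"
}

# Example usage (hypothetical path to the extracted tarball):
#   set_model_path config/config.toml "$HOME/spam-classifier/model.tflite"
```

Going through a temporary file avoids relying on `sed -i`, whose syntax differs between GNU and BSD sed.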

Testing (on the node where GitLab runs)

Command line (on a different terminal)

  1. Create a file spam.json with the following content
    {
    "title": "fifa xxx porn stream fifa xxx porn stream",
    "description": "fifa xxx porn stream fifa xxx porn stream",
    "user_in_project": false,
    "project": {
        "project_id": 14,
        "project_path": "spamtest/hello"
    },
    "user": {
        "emails": [{"email": "mr_stupendous@hotmail.com", "verified": true}],
        "username": "MrStupendous",
        "org": "GitLab"
    },
    "created_at": "2021-01-01T10:00:00Z",
    "updated_at": "2021-01-01T11:00:00Z"
    }
  2. Create a file ham.json with the following content
    {
    "title": "Sign up page not working",
    "description": "Sign up page not working when accessed from mobile",
    "user_in_project": true,
    "project": {
        "project_id": 14,
        "project_path": "spamtest/hello"
    },
    "user": {
        "emails": [{"email": "mr_stupendous@hotmail.com", "verified": true}],
        "username": "MrStupendous",
        "org": "GitLab"
    },
    "created_at": "2021-01-01T10:00:00Z",
    "updated_at": "2021-01-01T11:00:00Z"
    }
  3. Download and install grpcurl
  4. Run the following commands
    # Pass the spam.json file to the endpoint and see `BLOCK` verdict
    $ grpcurl -plaintext -d "$(cat spam.json)" localhost:8001 spamcheck.SpamcheckService/CheckForSpamIssue
    
    # Pass the ham.json file to the endpoint and see `ALLOW` verdict
    $ grpcurl -plaintext -d "$(cat ham.json)" localhost:8001 spamcheck.SpamcheckService/CheckForSpamIssue
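
To make the two checks above repeatable, the grpcurl calls can be wrapped in a small helper that compares the response with the expected verdict. This is a sketch, not part of Spamcheck: `check_verdict` and `SPAMCHECK_ADDR` are hypothetical names, and it assumes the JSON response contains the verdict name as a quoted string. The `-emit-defaults` flag is added on the assumption that a verdict equal to the enum's default value might otherwise be omitted from the output.

```shell
#!/bin/sh
# Address of the Spamcheck gRPC endpoint used in the commands above.
SPAMCHECK_ADDR="${SPAMCHECK_ADDR:-localhost:8001}"

# check_verdict PAYLOAD_FILE EXPECTED_VERDICT
# Returns 0 on a match, 1 on a mismatch, 2 if the grpcurl call itself fails.
check_verdict() {
    response=$(grpcurl -plaintext -emit-defaults -d "$(cat "$1")" \
        "$SPAMCHECK_ADDR" spamcheck.SpamcheckService/CheckForSpamIssue) || return 2
    # ASSUMPTION: the verdict appears in the response as a quoted enum name.
    case $response in
        *"\"$2\""*)
            echo "PASS: $1 -> $2"
            ;;
        *)
            echo "FAIL: $1 expected $2, got: $response"
            return 1
            ;;
    esac
}

# Example usage:
#   check_verdict spam.json BLOCK
#   check_verdict ham.json ALLOW
```

A non-zero exit status on mismatch lets the helper be used directly in a CI job or smoke-test script.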

Web UI

  1. Go to the Admin > Settings > Reporting page in the GitLab instance, and update the external Spamcheck settings as follows:

    1. Check the Enable Spam Check via external API endpoint checkbox.
    2. Use grpc://localhost:8001 as the URL.
    3. Leave the API key field empty.

  2. Create a project in the GitLab instance.

  3. As a different user (who is not a member of the project) create an issue in the project with the following text as subject and description: fifa xxx porn stream fifa xxx porn stream.

  4. See that issue creation has been blocked.

Edited Dec 14, 2021 by Balasankar 'Balu' C