  • #273618
Closed
Open
Issue created Oct 29, 2020 by Michael Friedrich (@dnsmichi, Developer)

Spam prevention: Option to prevent URL following for untrusted users (lower rating for search engine crawlers)

Release notes

Problem to solve

Spam is a hard battle to fight. I have experienced that myself: one of the measures was CAPTCHAs, but they were not enough. We migrated a forum to Discourse some years ago, and learned that its spam prevention system uses Akismet to check potential users (trust level 0).

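This system cannot actively prevent spam though. One bot pattern is to post URLs to get a better search engine ranking. You can control this behaviour with the rel attribute on HTML links: rel=nofollow asks crawlers not to credit the link, while leaving it out (informally "rel=follow") keeps the link followed.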

Discourse uses this method to allow followed links only for trusted users.

The idea originates from a HackerNews discussion: https://news.ycombinator.com/item?id=24924626

My "old" thoughts, which grew too big with a trust level and gamification system, are here: #14156 (comment 258252735). cc @heather @JohnathanHunt @sytses

Intended users

  • Sidney (Systems Administrator)

User experience goal

Reduce the number of spam bots registering and creating issues, because the URLs in their content are no longer followed (and not indexed) by crawlers.

Proposal

Add rel=nofollow to all URLs posted by users who are

  • not in the group / organization
  • not at least reporter/developer

This makes search engine crawlers ignore the URL and not index the relation. Bots will learn that their posts are ineffective and pick other targets.
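The membership rule above could be expressed roughly as follows. This is a minimal Python sketch, not GitLab code: the `AccessLevel` enum and the `should_nofollow` function are hypothetical names, the numeric values mirror GitLab's documented role levels (Guest 10, Reporter 20, Developer 30), and the default threshold of Reporter is an assumption, since the proposal leaves the choice between reporter and developer open.

```python
from enum import IntEnum
from typing import Optional


class AccessLevel(IntEnum):
    """Illustrative role levels (hypothetical enum, values as in GitLab's docs)."""
    GUEST = 10
    REPORTER = 20
    DEVELOPER = 30


def should_nofollow(
    author_level: Optional[AccessLevel],
    minimum_trusted: AccessLevel = AccessLevel.REPORTER,  # assumed threshold
) -> bool:
    """Return True when links posted by this author should get rel=nofollow.

    author_level is None when the author is not a member of the
    group / organization at all.
    """
    if author_level is None:
        return True  # non-members are always untrusted
    return author_level < minimum_trusted


# A non-member and a guest are untrusted; a reporter is trusted by default.
print(should_nofollow(None))
print(should_nofollow(AccessLevel.GUEST))
print(should_nofollow(AccessLevel.REPORTER))
```

Keeping the threshold as a parameter means the reporter-versus-developer question can be settled later without changing the call sites.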

This applies to any Markdown content that gets rendered as a URL:

  • Descriptions in Issues/MRs
  • Comments
  • Wiki
  • Snippets
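One way to picture the change is as a post-processing step over the HTML that the Markdown renderer produces. The sketch below uses only Python's standard `html.parser` module and is not GitLab's rendering engine; `NofollowRewriter` and `add_nofollow` are hypothetical names. It assumes the rendered HTML is well-formed and that only `<a>` elements with an `href` need the attribute.

```python
from html import escape
from html.parser import HTMLParser


class NofollowRewriter(HTMLParser):
    """Re-emit HTML unchanged, adding rel="nofollow" to <a href=...> tags."""

    def __init__(self):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    @staticmethod
    def _with_nofollow(attrs):
        # Merge "nofollow" into any existing rel tokens (e.g. "noopener").
        tokens, rest = [], []
        for name, value in attrs:
            if name == "rel":
                tokens.extend((value or "").split())
            else:
                rest.append((name, value))
        if "nofollow" not in tokens:
            tokens.append("nofollow")
        return rest + [("rel", " ".join(tokens))]

    def _emit_tag(self, tag, attrs, closing=""):
        if tag == "a" and any(name == "href" for name, _ in attrs):
            attrs = self._with_nofollow(attrs)
        parts = [tag]
        for name, value in attrs:
            if value is None:
                parts.append(name)  # boolean attribute without a value
            else:
                parts.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + closing + ">")

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, closing=" /")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")


def add_nofollow(html: str) -> str:
    """Return html with rel="nofollow" added to every link that has an href."""
    rewriter = NofollowRewriter()
    rewriter.feed(html)
    rewriter.close()
    return "".join(rewriter.out)


print(add_nofollow('<p>See <a href="https://example.com">site</a></p>'))
```

Because the step works on rendered HTML, the same function would cover descriptions, comments, wiki pages, and snippets alike, as long as they share one rendering path.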

Further details

This may need changes to our Markdown rendering engine.

A performance decrease is possible; its actual extent needs to be determined for large-scale instances.

Therefore I recommend making this an instance-level option for administrators, disabled by default.
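A rough sketch of how such an opt-in switch might gate the behaviour, in Python for illustration only. `InstanceSettings`, its `nofollow_untrusted_links` field, and `link_rel` are hypothetical names and do not correspond to existing GitLab settings. Checking the flag first means a disabled instance skips the trust lookup entirely, which is where any extra cost would come from.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class InstanceSettings:
    # Hypothetical admin setting; off by default, as the proposal recommends.
    nofollow_untrusted_links: bool = False


def link_rel(
    settings: InstanceSettings,
    author_is_trusted: Callable[[], bool],
) -> Optional[str]:
    """Return the rel value to add to a rendered link, or None for no change.

    author_is_trusted is a callable so the (possibly expensive) membership
    lookup only runs when the feature is enabled.
    """
    if not settings.nofollow_untrusted_links:
        return None  # feature disabled: render links exactly as before
    return None if author_is_trusted() else "nofollow"


# Default settings leave links untouched, even for untrusted authors.
print(link_rel(InstanceSettings(), lambda: False))
# With the option enabled, untrusted authors get rel=nofollow.
print(link_rel(InstanceSettings(nofollow_untrusted_links=True), lambda: False))
```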

Permissions and Security

Documentation

Settings and descriptions for security: https://docs.gitlab.com/ee/security/README.html

It may need a note for troubleshooting too.

Availability & Testing

  • Performance impact

What does success look like, and how can we measure that?

Fewer spam reports from public self-hosted instances. A positive impact on GitLab.com when enabled, with bots learning that their URLs are no longer followed.

What is the type of buyer?

This should be a Core feature available for everyone.

Is this a cross-stage feature?

Manage Access for enabling the setting, and the URL renderer in the backend. It may touch the editor (snippets) and the wiki too.

Links / references

Edited Oct 29, 2020 by Michael Friedrich