Refactor UserDetail to use Sanitizable concern (!208943) · Merge requests · GitLab.org / GitLab

Please check this box if this contribution uses AI-generated content (including content generated by GitLab Duo features) as outlined in the GitLab DCO & CLA. As a benefit of being a GitLab Community Contributor, you receive complimentary access to GitLab Duo.

What does this MR do and why?

Migrates the UserDetail model from the deprecated Sanitize.clean() method to the Sanitizable concern using the modern Sanitize.fragment() API. This refactoring improves code consistency by aligning with patterns already used in other models like NamespaceSetting and ApplicationSetting.

Key changes:

Migrated 8 user profile attributes (bluesky, discord, linkedin, mastodon, orcid, twitter, website_url, github) to use the sanitizes! declaration from the Sanitizable concern
Retained custom sanitization for location and organization fields to preserve special ampersand handling behavior (& → &)
Added security validations for pre-escaped HTML entities, double-encoded data, and path traversal attempts
Updated from deprecated Sanitize.clean() to modern Sanitize.fragment() API

Behavioral changes:

Double-encoding prevention: The Sanitizable concern uses CGI.unescapeHTML() to prevent double-encoding, so ampersands now remain unencoded rather than being converted to & entities. This is the more correct behavior.

New security validations: The following malicious patterns are now rejected with validation errors:

Attack Type	Example Input	Validation Error	Why Safe to Reject
Pre-escaped HTML entities (XSS bypass attempt)	`<script>alert(1)</script>`	"cannot contain escaped HTML entities"	Legitimate users don't manually encode HTML entities. This pattern indicates an attempt to bypass frontend sanitization by pre-encoding malicious scripts that might be decoded and executed later.
Double-encoded data (Encoding evasion)	`%2526lt%253Bscript%2526gt%253B`	"cannot contain escaped components"	Legitimate URLs and social media usernames don't contain double-encoded characters. This pattern indicates an attempt to evade sanitization through multiple encoding layers, hoping one layer gets decoded while bypassing filters.
Path traversal (Directory traversal attack)	`main../../../../../../api/v4/projects/1`	"cannot contain a path traversal component"	Legitimate social media handles, websites, and profile URLs never contain `../` sequences. This pattern indicates an attempt to access unauthorized file paths or resources through directory traversal.

These values were previously sanitized silently but are now explicitly rejected, providing better security feedback and preventing malicious data from being stored in the database.

🛠️ with ❤️ at Siemens

References

Related to issue #577060 (closed)
Sanitizable concern: app/models/concerns/sanitizable.rb
Similar implementations: NamespaceSetting and ApplicationSetting models

Screenshots or screen recordings

N/A - Backend refactoring with no UI changes

How to set up and validate locally

Open Rails console: bundle exec rails console

Test basic sanitization works:

user = User.first
ud = user.user_detail
ud.linkedin = '<script>alert("xss")</script>test'
ud.save
ud.linkedin # Should output: 'test' (script removed)

Test new security validations reject malicious input:

ud.twitter = '&lt;script&gt;alert(1)&lt;/script&gt;'
ud.valid? # Should be false
ud.errors.full_messages # Should include: "Twitter cannot contain escaped HTML entities"

ud.github = '../../../etc/passwd'
ud.valid? # Should be false
ud.errors.full_messages # Should include: "Github cannot contain a path traversal component"

Test special ampersand handling for location/organization:

ud.location = 'San Francisco & Bay Area'
ud.save
ud.location # Should output: 'San Francisco & Bay Area' (ampersand preserved, not encoded)

Run the test suite:

bundle exec rspec spec/models/user_detail_spec.rb

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

MR Checklist (@gerardo-navarro)

Edited Oct 15, 2025 by 🤖 GitLab Bot 🤖

Refactor UserDetail to use Sanitizable concern

What does this MR do and why?

References

Screenshots or screen recordings

How to set up and validate locally

MR acceptance checklist

Merge request reports