
Implement Kubernetes allowed users/groups in Runner config

What does this MR do?

This MR implements administrator override capability for Kubernetes executor user configuration, addressing issue #38894 (closed). It provides administrators with granular control over container security using Kubernetes SecurityContext configuration, enabling per-container user/group specification with proper precedence and security isolation.

  • Implemented container-specific security contexts: build_container_security_context, helper_container_security_context, service_container_security_context
  • Administrator can now override user/group per container type with full control
  • Precedence logic: SecurityContext > Job Config (administrator always wins)
  • Build containers inherit job-level user configuration if no security context specified
  • Helper containers do NOT inherit job-level user configuration
  • Service containers inherit job-level user configuration if no security context specified
  • Each container type uses independent security context configuration
  • Restored original warning behavior for user/group validation failures
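
The precedence and inheritance rules listed above can be modeled in a few lines. This is an illustrative Python sketch, not the runner's Go implementation: the function name `effective_user` and the container-type strings are hypothetical, and the job-level UID is assumed to have already passed allowlist validation.

```python
from typing import Optional


def effective_user(
    container: str,
    security_context_uid: Optional[int],
    job_uid: Optional[int],
) -> Optional[int]:
    """Return the UID a container runs as, or None for the Kubernetes default.

    `container` is one of "build", "helper", "service" (hypothetical labels).
    """
    # An administrator-configured SecurityContext always wins.
    if security_context_uid is not None:
        return security_context_uid
    # Helper containers never inherit the job-level user.
    if container == "helper":
        return None
    # Build and service containers fall back to the job-level user.
    return job_uid
```

For example, `effective_user("helper", None, 1000)` yields `None`, while the same call for `"build"` yields `1000`.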

⚠️ Not Implemented (Out of Scope) - Requires a follow-up

Advanced Allowlist Syntax

The original issue asked for advanced syntax such as:

allowed_users = [">1", "1000-2000"]
allowed_groups = ["0", ">1000"]

This MR maintains the current simple array-based allowlist format:

allowed_users = ["1000", "1001", "65534"]
allowed_groups = ["1001", "65534"] 

The core issue was administrator override capability, not advanced syntax. The SecurityContext approach provides comprehensive administrative control while maintaining simplicity.

The Docker executor also does not implement allowed_groups. Bringing the two executors to parity requires a separate follow-up.

Implementation Approach: SecurityContext vs Simple User Field

SecurityContext Validation Design Decision

Why SecurityContext Bypasses Allowlists:

  1. Administrator Privilege Model: SecurityContext values are configured in config.toml by system administrators with full server access
  2. Operational Requirements: Administrators need unrestricted override capability for compliance, security policies, and operational needs
  3. Clear Separation of Concerns:
    • Administrator SecurityContext: Unrestricted system-level configuration
    • Job Configuration: User-controlled, restricted by allowlists
  4. Consistent with Docker Executor: Docker's user field also bypasses allowed_users validation
  5. Kubernetes Native Approach: Leverages native SecurityContext without artificial restrictions

Warning Behavior

Original Behavior Preserved:

  • When user/group validation fails, warnings are logged with the original message format
  • When validation fails, the build continues and the container falls back to the Kubernetes default security context
  • No build failures from user validation - maintains backward compatibility
  • Warning message format: "Error parsing 'uid' from image options for container %q, using the configured security context: %v"
  • This message appears even when no actual security context is configured (original behavior)
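
The warning-only behavior can be sketched as follows. This is a simplified Python model under stated assumptions: the function name `validate_job_uid` and its return shape are hypothetical, an empty allowlist is treated as unrestricted (as in the open configuration example), and only the message format is taken from this MR.

```python
from typing import Optional, Sequence, Tuple

# Message format quoted in this MR; %q and %v are rendered by hand here.
WARNING_FORMAT = (
    "Error parsing 'uid' from image options for container \"{container}\", "
    "using the configured security context: {err}"
)


def validate_job_uid(
    container: str, uid: str, allowed_users: Sequence[str]
) -> Tuple[Optional[int], Optional[str]]:
    """Return (uid, None) when allowed, or (None, warning) when rejected.

    A rejected UID never fails the build: the caller logs the warning and
    leaves the container on the Kubernetes default security context.
    """
    if allowed_users and uid not in allowed_users:
        allowed = " ".join(allowed_users)
        err = f'validating UID: user "{uid}" is not in the allowed list: [{allowed}]'
        return None, WARNING_FORMAT.format(container=container, err=err)
    return int(uid), None
```

Returning the warning instead of raising an exception mirrors the design choice described above: validation failures are reported, but the job keeps running.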

Test Updates:

  • Updated integration tests to expect success with warnings instead of build failures
  • Fixed nil pointer dereference in integration test error handling
  • All security context tests now pass with original warning-only behavior

Implementation Comparison: Docker vs Kubernetes

Similarities (Consistent Admin Experience)

| Feature | Docker | Kubernetes | Status |
|---|---|---|---|
| Admin Override | Full control via runner config | Full control via SecurityContext | Identical capability |
| Precedence Logic | Runner > Job | SecurityContext > Job | Administrator always wins |
| Admin Allowlist Validation | Runner config bypasses allowed_users | SecurityContext bypasses all allowlists | Consistent bypass behavior |
| Job Allowlist Validation | Job config validated against allowed_users | Job config validated against allowlists | Consistent validation |
| Job Configuration | Image user specification | Image kubernetes user specification | Consistent |
| Warning Behavior | Warnings on validation failure | Warnings on validation failure | Consistent behavior |

Differences (Platform & Architecture)

| Aspect | Docker | Kubernetes | Reason |
|---|---|---|---|
| Configuration Method | Single user field | Container-specific SecurityContext | Granular control & security isolation |
| User Format | Username OR UID | Numeric UID only | Kubernetes SecurityContext requirement |
| Group Support | Not supported in user field | Full numeric GID support | Kubernetes SecurityContext capability |
| Helper Container | Same as build container | Independent security context | Security isolation from job config |
| Granularity | Single setting for all | Per-container configuration | Better security boundaries |

End User Result Comparison

Docker Experience:

[runners.docker]
  user = "1000"                    # Override job user for all containers
  allowed_users = ["1000", "1001"] # Restrict allowed values

Kubernetes Experience:

[runners.kubernetes]
  allowed_users = ["1000", "1001"]           # Restricts job config only
  allowed_groups = ["1001", "65534"]         # Restricts job config only
  
  # Granular per-container control - BYPASSES allowlists
  [runners.kubernetes.build_container_security_context]
    run_as_user = 65534                       # Can be ANY user - bypasses allowlist
    run_as_group = 65534                      # Can be ANY group - bypasses allowlist
    
  [runners.kubernetes.helper_container_security_context]  
    run_as_user = 0                           # Even root allowed - bypasses allowlist
    run_as_group = 0                          # Even root allowed - bypasses allowlist

Result: Administrators get enhanced control capabilities with better security isolation through platform-native SecurityContext configuration.

Configuration Examples

Runner Configuration (config.toml)

Administrator Override with SecurityContext (Bypasses Allowlists)

[[runners]]
  name = "k8s-runner"
  url = "https://gitlab.example.com"
  executor = "kubernetes"
  [runners.kubernetes]
    allowed_users = ["1000", "1001"]               # Jobs restricted to these users
    allowed_groups = ["1001", "1002"]              # Jobs restricted to these groups
    
    # Build container security context - BYPASSES allowlist validation
    [runners.kubernetes.build_container_security_context]
      run_as_user = 65534                         # NOT in allowed_users - but bypasses validation
      run_as_group = 65534                        # NOT in allowed_groups - but bypasses validation
      
    # Helper container security context - BYPASSES allowlist validation
    [runners.kubernetes.helper_container_security_context]
      run_as_user = 0                             # Root access - bypasses validation
      run_as_group = 0                            # Root access - bypasses validation

Allowlist Only (No Admin Override - Job Config Validated)

[[runners]]
  name = "k8s-runner" 
  url = "https://gitlab.example.com"
  executor = "kubernetes"
  [runners.kubernetes]
    allowed_users = ["1000", "1001", "65534"]    # Jobs restricted to these users only
    allowed_groups = ["1001", "65534"]           # Jobs restricted to these groups only
    # No SecurityContext = job config validated against allowlists

Open Configuration (No Restrictions on Job Config)

[[runners]]
  name = "k8s-runner"
  url = "https://gitlab.example.com" 
  executor = "kubernetes"
  [runners.kubernetes]
    # No allowed_users = job config can specify any user  
    # No allowed_groups = job config can specify any group
    # No SecurityContext = jobs have full control within container capabilities

Job Configuration (.gitlab-ci.yml)

Job User Configuration

# Basic job with user specification
build:
  image:
    name: alpine:latest
    kubernetes:
      user: "1000:1001"  # User and group IDs
  script:
    - whoami
    - id

# Job with user only
test:
  image:
    name: alpine:latest
    kubernetes:
      user: "1000"       # User ID only
  script:
    - whoami
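
The two accepted forms of the job-level `user` value, "UID" and "UID:GID", could be split as in the sketch below. This is a hypothetical Python helper for illustration; the runner's actual parsing may differ in how it handles edge cases. The numeric-only rule follows from the Kubernetes SecurityContext requirement noted in the comparison above.

```python
from typing import Optional, Tuple


def _numeric_id(text: str, what: str) -> int:
    # Kubernetes only accepts numeric IDs, so usernames are rejected.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"{what} must be numeric, got {text!r}")
    return int(text)


def parse_user(value: str) -> Tuple[int, Optional[int]]:
    """Split "UID" or "UID:GID" into (uid, gid); gid is None when omitted."""
    uid_text, sep, gid_text = value.partition(":")
    uid = _numeric_id(uid_text, "UID")
    gid = _numeric_id(gid_text, "GID") if sep else None
    return uid, gid
```

With this helper, `parse_user("1000:1001")` gives `(1000, 1001)`, `parse_user("1000")` gives `(1000, None)`, and a username such as `"root"` raises `ValueError`.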

Expected Results & Precedence

Scenario 1: SecurityContext Override Active (Bypasses Allowlists)

# Runner config
[runners.kubernetes]
  allowed_users = ["1000", "1001"]     # SecurityContext IGNORES this
  allowed_groups = ["1001", "1002"]    # SecurityContext IGNORES this
  
  [runners.kubernetes.build_container_security_context]
    run_as_user = 65534                 # Not in allowlist - but BYPASSES validation
    run_as_group = 65534                # Not in allowlist - but BYPASSES validation

# Job config (.gitlab-ci.yml)
image:
  kubernetes:
    user: "1000:1001"                  # Would pass allowlist validation

Result:

  • Build container runs as 65534:65534 (SecurityContext bypasses allowlist)
  • Helper: Uses its own helper_container_security_context and does NOT inherit job config
  • Key: Administrator SecurityContext values are never validated against allowlists

Scenario 2: Allowlist Validation Success

# Runner config
[runners.kubernetes]
  allowed_users = ["1000", "65534"]
  allowed_groups = ["1001", "65534"]
  # No security context specified

# Job config (.gitlab-ci.yml)
image:
  kubernetes:
    user: "1000:1001"  

Result: Build container runs as 1000:1001 (validation passes, job config used)

Scenario 3: Job Configuration Allowlist Violation (Warning Behavior)

# Runner config  
[runners.kubernetes]
  allowed_users = ["1000", "65534"]
  # No SecurityContext specified - job config will be validated

# Job config (.gitlab-ci.yml)
image:
  kubernetes:
    user: "1001:1001"                  # 1001 not in allowed_users

Result:

  • Build succeeds with warning: "Error parsing 'uid' from image options for container "build", using the configured security context: validating UID: user "1001" is not in the allowed list: [1000 65534]"
  • Container uses Kubernetes default security context (RunAsUser/RunAsGroup not set)
  • Key: Validation failures generate warnings but do not fail builds

Scenario 4: Helper Container Isolation

# Runner config - only build container has security context
[runners.kubernetes.build_container_security_context]
  run_as_user = 1000
  run_as_group = 1001
# No helper_container_security_context specified

# Job config (.gitlab-ci.yml)
image:
  kubernetes:
    user: "2000:2001"

Result:

  • Build container runs as 1000:1001 (SecurityContext override)
  • Helper container runs with default user (does NOT inherit job's 2000:2001)

Scenario 5: Administrator Root Access (SecurityContext Bypass)

# Runner config
[runners.kubernetes]
  allowed_users = ["1000", "1001"]           # Root (0) not allowed for jobs
  
  [runners.kubernetes.build_container_security_context]
    run_as_user = 0                           # Root - BYPASSES allowlist validation
    run_as_group = 0                          # Root - BYPASSES allowlist validation

# Job config (.gitlab-ci.yml) - this would generate warnings
image:
  kubernetes:
    user: "0:0"                              # Would warn: root not in allowed_users

Result:

  • Build container runs as 0:0 (root) because SecurityContext bypasses validation
  • Key: Administrator can grant root access via SecurityContext even when jobs cannot

Scenario 6: Job Configuration Root Protection (Warning Only)

# Runner config
[runners.kubernetes]
  allowed_users = ["1000", "1001"]           # Root (0) not in allowlist
  # No SecurityContext specified

# Job config (.gitlab-ci.yml)
image:
  kubernetes:
    user: "0:0"                              # Root user attempt

Result:

  • Build succeeds with warning: "Error parsing 'uid' from image options for container "build", using the configured security context: validating UID: user "0" is not in the allowed list: [1000 1001]"
  • Container uses Kubernetes default security context
  • Key: Job configuration validation generates warnings, not failures

From Docker to Kubernetes Administrative Override

# Before (Docker)
[runners.docker]
  user = "1000"                        # Bypasses allowed_users validation
  allowed_users = ["1000", "1001"]     # Only validates job config

# After (Kubernetes SecurityContext equivalent)  
[runners.kubernetes]
  allowed_users = ["1000", "1001"]     # Only validates job config (same behavior)
  allowed_groups = ["1001", "65534"]   # Add group allowlist for job config
  
  # SecurityContext provides granular control with same bypass behavior
  [runners.kubernetes.build_container_security_context]
    run_as_user = 1000                  # Bypasses allowed_users (same as Docker)
    run_as_group = 1001                 # Enhanced: explicit group control
    
  [runners.kubernetes.helper_container_security_context]
    run_as_user = 1000                  # Bypasses allowed_users
    run_as_group = 1001                 # Enhanced: helper isolation

Migration Behavior: SecurityContext maintains the same bypass behavior as Docker's user field

Security Considerations

Administrator Override Model

  1. Unrestricted Administrator Control: SecurityContext values bypass all allowlist validation
    • Rationale: Administrators configure SecurityContext in runner config.toml
    • Assumption: Administrators have full system access and security responsibility
    • Provides unrestricted override capability for operational requirements

Job Configuration Protection

  1. User Restriction Maintained: Job configuration values are validated against allowlists
    • Job users/groups validated against allowed_users/allowed_groups arrays
    • Validation failures generate warnings but allow builds to continue
    • Uses Kubernetes default security context when validation fails
    • Maintains backward compatibility with existing warning behavior

Additional Security Features

  1. Container Isolation: Helper containers never inherit job user config, maintaining security boundaries
  2. Kubernetes Native: Uses native SecurityContext, following Kubernetes security best practices
  3. Numeric Only: Kubernetes SecurityContext accepts only numeric IDs, so the effective user does not depend on how a username resolves inside the image
  4. Sentinel Values: Uses -1 to distinguish unset from root (0), preventing confusion
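
The reason for a sentinel can be shown with a short sketch. This is an illustrative Python model only: the names `UNSET` and `run_as_user` are hypothetical, and where the runner applies the sentinel is not described in this MR. The underlying problem is that root is UID 0, which is also the default value of an unset integer in Go, so "not configured" needs a distinct marker.

```python
from typing import Optional

UNSET = -1  # hypothetical sentinel: no UID was requested


def run_as_user(uid: int) -> Optional[int]:
    """Map an internal UID to the value placed in the security context.

    Returns None (field omitted) only for the sentinel.
    """
    # Compare against the sentinel explicitly. A truthiness check such as
    # `if uid:` would wrongly treat root (0) as "not set".
    return None if uid == UNSET else uid
```

Here `run_as_user(0)` returns `0`, so an explicit request for root is preserved, while `run_as_user(UNSET)` returns `None` and leaves the field out.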

Security Model Summary

  • Administrator SecurityContext: Unrestricted (bypasses allowlists)
  • Job Configuration: Restricted (validated against allowlists with warnings)
  • Precedence: SecurityContext > Job Config (administrator always wins)
  • Validation Behavior: Warnings only, builds continue (maintains backward compatibility)

Related Issues & MRs

  • Fixes #38894 (closed) - Allow admins to override image:kubernetes:user value
  • Related to !5469 (merged) - Original user configuration implementation
Edited by Georgi N. Georgiev | GitLab
