Skip to content

Introduce a class for safe file handling as an alternative to IO and File

Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.

🔥 Problem

In the past, we faced some Application Security issues with path traversals that allowed 😈 users to get the contents of an arbitrary file in the system (gitlab.yml for example).

See

In one of the retrospectives, we discussed how we could improve things so that we can protect GitLab against those kind of vulnerabilities.

🚒 Solution

In the same way that we use Gitlab::HTTP to make safe web requests, have a class that can make safe(r) file requests.

  • Create class(es) that have the same interface as File and IO. Then add rubocop rules to prevent further introduction of direct calls to IO or File
  • Add a safe_read method that accepts a path to read and the valid folder location, and we check it against Gitlab::Utils#check_path_traversal! and #check_allowed_absolute_path!
  • Then we go through the code slowly transitioning from read(path) to safe_read(path, allowed_locations)

We could later do the same for write methods and readlines and all those sort of extra methods.

class Gitlab
  class IO
    def read(*args)
      super(args)
    end
    def safely_read(path, allowlist, *args)
      Gitlab::Utils.check_path_traversal!(path)
      Gitlab::Utils.check_allowed_absolute_path!(path, allowlist)
      raise if File.symlink?(path)
      super(path, args)
    end
  end
end
module Gitlab
  module ImportExport
    module Json
      class NdjsonReader
        MAX_JSON_DOCUMENT_SIZE = 50.megabytes

        attr_reader :dir_path

        def initialize(dir_path)
          @dir_path = dir_path
          @consumed_relations = Set.new
        end

        def exist?
          Dir.exist?(@dir_path)
        end

        def consume_attributes(importable_path)
          # This reads from `tree/project.json`
          path = file_path("#{importable_path}.json")

          raise Gitlab::ImportExport::Error, 'Invalid file' if !File.exist?(path) || File.symlink?(path)

---          data = File.read(path, MAX_JSON_DOCUMENT_SIZE)
+++          data = Gitlab::IO.read(path, dir_path, MAX_JSON_DOCUMENT_SIZE)
          json_decode(data)
        end

😿 Downsides

🎯 Coverage is GitLab code only

This protection wouldn't apply to dependencies, which is where a lot of file handling happens

GitLab-rails:

$ git grep -E '(File.read|IO.read)' | grep -v '_spec.rb' | grep -v 'spec/' | wc -l
     181

Gems used by GitLab-rails:

$ grep -R -E '(File.read|IO.read)' * | grep -v '_test.rb' | grep -v '_spec.rb'  | wc -l
    1500

😕 The caller might not know the allowed dirs

The allowlist might be dynamic. Who validates the dir? (E.g. in the diff example above).

We could have a deny list of paths though, to accompany (and override??...) the allowlist

  • /etc/*
  • "#{Rails.root}/config/*.yml"

🔮 Alternatives

Edited by 🤖 GitLab Bot 🤖