Provide mechanism to load GraphQL with all dependencies only when needed
During the GitLab on 2GB research week, the Memory group identified several gems that contribute significantly to overall memory use, yet are only useful for specific kinds of workloads. These include (but are not limited to):
- GraphQL
- Grape (REST API)
- Rouge (syntax highlighting)
- GetText (translations)
Some of these require tens of MB just by being required: https://gitlab.com/gitlab-org/memory-team/memory-team-2gb-week/-/issues/4#note_449831973.
We also discovered that we load/require 14480 files (https://gitlab.com/gitlab-org/memory-team/memory-team-2gb-week/-/issues/9#note_452530513) when we start GitLab. 1274 of those files belong to GraphQL. This means that if we skip loading those 1274 application files and all related GraphQL gems in processes that don't need them (such as Sidekiq), we could save a lot of memory.
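To put these numbers in proportion, and to show one way such counts could be gathered, here is a minimal sketch. It assumes that the size of Ruby's $LOADED_FEATURES array is an acceptable proxy for "files loaded"; the helper name files_loaded_by is hypothetical and not part of GitLab.

```ruby
# Hypothetical helper: counts how many entries a block adds to
# $LOADED_FEATURES (the list of files Ruby has loaded via require).
def files_loaded_by
  before = $LOADED_FEATURES.size
  yield
  $LOADED_FEATURES.size - before
end

# Share of the files quoted in the text that belong to GraphQL.
total_files   = 14_480
graphql_files = 1_274
graphql_share = (graphql_files.to_f / total_files * 100).round(1)

# Example measurement with a standard-library file. The result depends on
# what the process has already loaded, so no fixed number is expected.
added = files_loaded_by { require 'securerandom' }

puts "GraphQL share of loaded files: #{graphql_share}%"
puts "Files added by requiring securerandom: #{added}"
```

File count is only a rough indicator; the memory figures in the linked notes remain the authoritative measurement.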
A big caveat here is that something like "run only on API nodes" will not benefit smaller Omnibus setups, since those do not discriminate between web and API workers (we only do that for .com). However, Sidekiq would likely greatly benefit from excluding unused gems and unused application code in all deployments.
Note: This is not just about loading gems, but also about GitLab application code. The goal is to provide a mechanism to selectively load a part of the application together with all the gems it requires.
With a generic mechanism, the long-term goal would be to split the application into functional parts, so that only the needed code is loaded, together with its dependencies.
Proposed Solution
1. Introduce new Rails Engine for each application context
The proposed mechanism will use Rails engines to split the application into smaller functional parts, so that only the needed code is loaded, together with its dependencies.
With a full engine type, the parent application inherits the routes from the engine, and there is no namespacing of models, controllers, etc. If the engine is required, these are immediately accessible to the parent application, as they were before.
Code Structure
We will introduce the engines folder, which will contain separate engines for different functionalities/contexts.
One of the engines would be the web_engine, which will contain all the application code, gems, and routes needed for the Web context (API/ActionCable/Controllers).
We should move all web-specific code here from the /app, /lib, ee/app, and ee/lib folders, such as code related to GraphQL, Grape, Controllers, ...
For each context that we would like to selectively load, we should create a separate engine.
Engine Gemfile
Move all gems that are required to successfully run the code inside the engine to the engine's own Gemfile:
Gem::Specification.new do |spec|
  spec.add_dependency 'graphql', '~> 2.0.2'
  spec.add_dependency 'graphiql-rails', '~> 1.4.10'
end
Engine routes
Move all routes that are only used by the engine to the engine's engines/<engine_name>/config/routes.rb file:
Rails.application.routes.draw do
  post '/api/graphql', to: 'graphql#execute'
  mount GraphiQL::Rails::Engine,
    at: '/-/graphql-explorer',
    graphql_path: Gitlab::Utils.append_path(Gitlab.config.gitlab.relative_url_root, '/api/graphql')
end
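The graphql_path option in these routes depends on joining the configured relative URL root with the API path. A minimal sketch of the expected result follows. The append_path method here is a hypothetical stand-in written for illustration, not the actual Gitlab::Utils.append_path, and the /gitlab root is an assumed example value.

```ruby
# Hypothetical stand-in for Gitlab::Utils.append_path, written only to
# illustrate the intended result; the real helper may differ.
def append_path(root, path)
  "#{root.to_s.sub(%r{/+\z}, '')}/#{path.to_s.sub(%r{\A/+}, '')}"
end

no_root    = append_path(nil, '/api/graphql')       # no relative URL root configured
with_root  = append_path('/gitlab', '/api/graphql') # instance served under /gitlab
normalised = append_path('/gitlab/', 'api/graphql') # stray slashes are collapsed

puts no_root, with_root, normalised
```

Under this sketch, an instance served from a subpath points the GraphiQL explorer at the prefixed endpoint, while a root-level instance keeps /api/graphql unchanged.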
Engine initializers
Move all initializers that are only used by the engine to the engine's engines/<engine_name>/config/initializers folder:
# graphql.rb initializer
GraphQL::ObjectType.accepts_definitions(authorize: GraphQL::Define.assign_metadata_key(:authorize))
GraphQL::Field.accepts_definitions(authorize: GraphQL::Define.assign_metadata_key(:authorize))
GraphQL::Schema::Object.accepts_definition(:authorize)
GraphQL::Schema::Field.accepts_definition(:authorize)
2. Connect the GitLab application with the engine
GitLab Gemfile
In the GitLab Gemfile, add the engine to the engines group:
# Gemfile
group :engines, :test do
  gem 'web_engine', path: 'engines/web_engine'
end
Since the gem is inside the :engines group, it will not be automatically required by default.
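The effect of the group placement can be illustrated with a simplified model. This sketch assumes the application boots with the usual Bundler.require call for the :default group plus the current environment group. It is not Bundler's implementation, and the gem names other than web_engine are illustrative.

```ruby
# Simplified model of Bundler groups; this is not Bundler's implementation.
# Gem names other than web_engine are illustrative.
GEM_GROUPS = {
  'rails'      => [:default],
  'rspec'      => [:test],
  'web_engine' => [:engines, :test]
}.freeze

# Returns the gems that would be auto-required for the given groups.
def auto_required(groups)
  GEM_GROUPS.select { |_name, gem_groups| (gem_groups & groups).any? }.keys
end

production_gems = auto_required([:default, :production]) # web_engine is skipped
test_gems       = auto_required([:default, :test])       # web_engine comes in via :test

puts production_gems.inspect
puts test_gems.inspect
```

In this model the engine is skipped outside the test environment unless something requires it explicitly, which is the hook the next step uses.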
Configure when GitLab loads the engine
In GitLab's config/engines.rb, we can configure when we want to load our engines by relying on Gitlab::Runtime:
# config/engines.rb
# Load only in case we are running web_server or rails console
if Gitlab::Runtime.web_server? || Gitlab::Runtime.console?
  require 'web_engine'
end
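If more engines are added later, the same decision could be expressed as a lookup table instead of one conditional per engine. The following is a minimal sketch under that assumption. FakeRuntime is a hypothetical stand-in for Gitlab::Runtime so that the example runs on its own, and ENGINE_CONDITIONS and engines_for are hypothetical names.

```ruby
# Hypothetical stand-in for Gitlab::Runtime, used so the sketch runs on its own.
FakeRuntime = Struct.new(:name) do
  def web_server?
    name == :puma
  end

  def console?
    name == :console
  end
end

# Hypothetical table mapping each engine to the condition under which it loads.
ENGINE_CONDITIONS = {
  'web_engine' => ->(runtime) { runtime.web_server? || runtime.console? }
}.freeze

# Returns the names of the engines that should be required for this runtime.
def engines_for(runtime)
  ENGINE_CONDITIONS.select { |_engine, condition| condition.call(runtime) }.keys
end

web_engines     = engines_for(FakeRuntime.new(:puma))    # web_engine is selected
sidekiq_engines = engines_for(FakeRuntime.new(:sidekiq)) # nothing is selected

puts web_engines.inspect
puts sidekiq_engines.inspect
```

In the real file, each selected name would then be passed to require, as in the conditional shown in config/engines.rb.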
Engine configuration
Our Engine inherits from the Rails::Engine class. This way, the gem notifies Rails that there is an engine at the specified path, so Rails correctly mounts the engine inside the application and performs tasks such as adding the engine's app directory to the load path for models, mailers, controllers, and views.
The file at lib/web_engine/engine.rb is identical in function to a standard Rails application's config/application.rb file. This way, engines can access a config object which contains configuration shared by all railties and the application. Additionally, each engine can access autoload_paths, eager_load_paths, and autoload_once_paths settings, which are scoped to that engine.
module WebEngine
  class Engine < ::Rails::Engine
    config.eager_load_paths.push(*%W[
      #{config.root}/lib
      #{config.root}/app/graphql/resolvers/concerns
      #{config.root}/app/graphql/mutations/concerns
      #{config.root}/app/graphql/types/concerns
    ])

    if Gitlab.ee?
      ee_paths = config.eager_load_paths.each_with_object([]) do |path, memo|
        ee_path = config.root
                    .join('ee', Pathname.new(path).relative_path_from(config.root))
        memo << ee_path.to_s
      end
      # Eager load should load CE first
      config.eager_load_paths.push(*ee_paths)
    end
  end
end
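The EE branch of this configuration mirrors every CE eager-load path under the engine's ee directory. The path arithmetic can be tried in isolation with the sketch below. The engine root used here is a hypothetical value standing in for config.root, and only two of the paths are shown.

```ruby
require 'pathname'

# Hypothetical engine root; inside the engine this would be config.root.
root = Pathname.new('/srv/gitlab/engines/web_engine')

ce_paths = [
  root.join('lib').to_s,
  root.join('app/graphql/types/concerns').to_s
]

# Mirror each CE path under the engine's ee/ directory.
ee_paths = ce_paths.map do |path|
  root.join('ee', Pathname.new(path).relative_path_from(root)).to_s
end

# CE paths stay first, matching the "load CE first" comment in the engine.
eager_load_paths = ce_paths + ee_paths
puts eager_load_paths
```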
3. Test with the Engine
- Move spec files to the engines/web_engine/spec folder
- Move ee/spec files to the engines/web_engine/ee/spec folder
- Control specs from the main application using the TEST_WEB_ENGINE environment variable
- Add CI job that will run engines/web_engine/spec tests separately using TEST_WEB_ENGINE env variable.
- Add CI job that will run engines/web_engine/ee/spec tests separately using TEST_WEB_ENGINE env variable.
- Run whitebox frontend tests with TEST_WEB_ENGINE=true
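One possible reading of the environment-variable switch is sketched below. The helper names, the accepted values, and the choice of spec and ee/spec as the main application's directories are all assumptions; the proposal itself only shows TEST_WEB_ENGINE=true.

```ruby
# Hypothetical helpers; the accepted values for TEST_WEB_ENGINE are an
# assumption, the proposal only shows TEST_WEB_ENGINE=true.
def test_web_engine?(env = ENV)
  env.fetch('TEST_WEB_ENGINE', 'false').casecmp?('true')
end

# Spec directories a run would cover under this assumption.
def spec_directories(env = ENV)
  return ['spec', 'ee/spec'] unless test_web_engine?(env)

  ['engines/web_engine/spec', 'engines/web_engine/ee/spec']
end

puts spec_directories('TEST_WEB_ENGINE' => 'true').inspect
puts spec_directories({}).inspect
```

Passing the environment as an argument keeps the helpers easy to exercise without mutating the real ENV.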
Outcome
Proposal for implementation for next release via architectural blueprint
- Evolution Workflow | GitLab - https://about.gitlab.com/handbook/engineering/architecture/workflow/