Draft: Add Organization Isolation queries on organization-sharded models
What does this MR do and why?
This MR will modify database queries: it will scope queries to the Current.organization.
The reason for this MR: for organization isolation, we need to scope all queries to the current organization.
There are several options for this:
- Post processing using
pg_querygem: !204542 (closed) this means that the Arel AST is changed into SQL and then parsed into a PGQuery AST, this seems redundant work - Rails scopes
- Use some kind of proxy for PostgreSQL, next to or in place of pg bouncer
Example
# No Current.organization defined, so no query manipulation
[4] pry(main)> puts Project.first.organization.name
Default
# SELECT <columns> FROM "projects" ORDER BY "projects"."id" ASC LIMIT 1
# Use a new organization for Current.organization
[5] pry(main)> Current.organization = Organizations::Organization.find_or_create_by!(name: 'my org', path: 'my-org')
# This is the proof: we now want to retrieve the project that belongs to organization_id = 1.
# But we set the Current.organization to "my org". This Project does not belong to that organization
# So for the code, it seems like the Project does not exist and it will raise an error:
[6] pry(main)> puts Project.first.organization.name
NoMethodError: undefined method 'organization' for nil
# SELECT <columns> FROM "projects" WHERE "projects"."organization_id" = 1000 ORDER BY
"projects"."id" ASC LIMIT 1
# We can 'move' the project to the newly created organization:
[7] pry(main)> Project.where(id: 1).update_all(organization_id: Current.organization.id)
# And now, the organization is 'my-org' and that is the Current.organization so the project is found
[8] pry(main)> puts Project.first.organization.name
my-org
# Bypass organization scoping: no change in SQL
[9] pry(main)> Gitlab::Database::Organizations::Scope.without_organization_scope{ puts Project.first.organization.name }
Default
How it works
The code is implemented as an extension to the existing ActiveRecord logic. ActiveRecord builds queries using the Arel class. The Arel logic is encapsulated in the ActiveRecord::Relation class. The ActiveRecord::Relation class has a method arel. This method returns an Arel AST. We can patch this method:
If we should add an organization scope, it will return a modified Arel AST: the FROM-part of the AST will have an additional 'WHERE organization_id = XXX' clause
If we should not add a scope, the original Arel AST is returned. This will happen when:
- No Current.organization is set
- When the table has no sharding key that references
organizationstable. - When the table has multiple sharding keys (Snippets for example)
- When the logic is wrapped in a
Gitlab::Database::Organizations::Scope.without_organization_scopeblock
UseSqlFunctionForPrimaryKeyLookups Some queries use a PostgreSQL function: this function doesn't know about organization_id. I added 3 new functions that support organization_id. #429479 (closed)
Performance
Arel AST manipulation
Impact on the application logic is minimal. Retrieving a random group 10000 times:
user system total real
Without current_org 45.782577 1.758297 47.540874 ( 47.645114)
With current_org 45.544311 1.011991 46.556302 ( 46.729764)
TODO
- Add Feature flag. Would be nice if we could enable this for organization_id > 1. So existing operation won't be affected. Only Organizations that are new.
- Current implementation is only adding WHERE organization=X to table(s) that are in the FROM clause.
- Less changes to our queries: saver to rollout
- Joins are not scoped
- CTE Support
- Performance impact:
- Database queries
References
Related to #582272
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.