Skip to content

WIP: JSONB settings for emails on push service

Andy Schoenen requested to merge services_jsonb_settings into master

What does this MR do?

This is an experiment to see if we can implement mass integrations on the database level. Mass integrations mean that we want to be able to activate services on the instance, group and project level. The services would inherit settings from the next level field by field. When a field on the lower level is NULL that means the value will be inherited.

Based on the idea of @abrandl, I added a jsonb field to the services table that should hold all the settings. Then we would add a database view that orders the service and its ancestors by level and merges the settings.

with
recursive r AS (
  select *, 1 as level from integrations where id = 4
  UNION ALL
  select integrations.*, level + 1 as level from r JOIN integrations ON r.parent_id = integrations.id
),
all_levels AS (

  SELECT * FROM r order by level desc
)
select jsonb_merge(settings) from all_levels;

Here is an example of how the result of the view looks compared the actual data in the services table:

Query: SELECT project_id, settings FROM integrations WHERE project_id IS NOT NULL;

Result:

project_id settings
1 { url: 'example.com', username: 'group', password: '123' }
2 { url: 'example.com', username: 'admin', password: '321' }
3 { url: 'foo.com', username: 'project', password: '5678' }

Project 1 inherits from the group level with overwritten username and password, project 2 inherits all settings from the instance level and project 3 overwrites all the settings.

Actual data in the table:

description project_id group_id instance settings
Instance level service NULL NULL TRUE { url: 'example.com', username: 'admin', password: '123' }
group level service NULL 1 FALSE { username: 'group', password: '321' }
project level service. Project in group 1 1 NULL FALSE { }
project level service. Project without a group 2 NULL FALSE { }
project level service. Project without a group 3 NULL FALSE { url: 'foo.com', username: 'project', password: '5678' }

Findings

  1. There is no parent_id field on the services table. If there is an instance level service, the highest level ancestor is the service where instance IS TRUE. The other levels can be joined over the projects and groups table. This query is quite complex.
  2. The services table already has a column called properties which stores part of the settings in JSON. Unfortunately we can not use this because it is of type text but in order to merge json we need jsonb.
  3. Services using properties, can be easily moved to a new column.
  4. We also need to move event settings into the new column. Currently, there is a column for each event type.
  5. In the example, the recursive function starts from one specific service and finds all the ancestors. If we want to query the view though, we need to start from the top level service and then build the table all the way down to the lowest level services. Example query SELECT * FROM integrations WHERE settings->>'push_events' = 1;. This is probably slow.
  6. We need to figure out how to do encryption in jsonb fields if we want to use this for services like Jira

Performance

Select from view:

gitlabhq_development=# EXPLAIN SELECT * FROM integrations WHERE project_id = 1;
                                                                                QUERY PLAN                                                                                
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on recursive_services  (cost=74.64..76.91 rows=1 width=112)
   Filter: (project_id = 1)
   CTE services_with_parent
     ->  Hash Left Join  (cost=28.68..42.37 rows=100 width=80)
           Hash Cond: ((services.type)::text = (instance.type)::text)
           ->  Hash Left Join  (cost=17.05..29.37 rows=100 width=80)
                 Hash Cond: (((services.type)::text = (parent.type)::text) AND (CASE WHEN (groupo.type IS NULL) THEN '-1'::integer ELSE groupo.id END = parent.group_id))
                 ->  Hash Left Join  (cost=4.55..16.11 rows=100 width=86)
                       Hash Cond: (project.namespace_id = groupo.id)
                       ->  Hash Left Join  (cost=2.41..13.68 rows=100 width=80)
                             Hash Cond: (services.project_id = project.id)
                             ->  Seq Scan on services  (cost=0.00..11.00 rows=100 width=76)
                             ->  Hash  (cost=2.18..2.18 rows=18 width=8)
                                   ->  Seq Scan on projects project  (cost=0.00..2.18 rows=18 width=8)
                       ->  Hash  (cost=1.51..1.51 rows=51 width=10)
                             ->  Seq Scan on namespaces groupo  (cost=0.00..1.51 rows=51 width=10)
                 ->  Hash  (cost=11.00..11.00 rows=100 width=40)
                       ->  Seq Scan on services parent  (cost=0.00..11.00 rows=100 width=40)
           ->  Hash  (cost=11.00..11.00 rows=50 width=36)
                 ->  Seq Scan on services instance  (cost=0.00..11.00 rows=50 width=36)
                       Filter: (instance IS TRUE)
   CTE recursive_services
     ->  Recursive Union  (cost=0.00..32.27 rows=101 width=112)
           ->  CTE Scan on services_with_parent  (cost=0.00..2.00 rows=1 width=112)
                 Filter: (parent_id IS NULL)
           ->  Hash Join  (cost=0.33..2.83 rows=10 width=112)
                 Hash Cond: (services_with_parent_1.parent_id = recursive_services_1.id)
                 ->  CTE Scan on services_with_parent services_with_parent_1  (cost=0.00..2.00 rows=100 width=80)
                 ->  Hash  (cost=0.20..0.20 rows=10 width=36)
                       ->  WorkTable Scan on recursive_services recursive_services_1  (cost=0.00..0.20 rows=10 width=36)
(30 rows)

Select from services table to compare cost

gitlabhq_development=# EXPLAIN SELECT * FROM services WHERE project_id = 1;
                                               QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
 Index Scan using index_services_on_project_id_and_type on services  (cost=0.14..2.16 rows=1 width=723)
   Index Cond: (project_id = 1)
(2 rows)
Edited by 🤖 GitLab Bot 🤖

Merge request reports