Skip to content

Change JiraTrackerData#deployment_type based on URL

What does this MR do?

After fixing #324251 (closed), we had a customer report that their JIRA integration stopped working. Sentry event confirmed this to come from the fact that the project's corresponding JiraService's JiraTrackerData#deployment_type was set to 'unknown'. This background migration will set deployment_type for rows where deployment_type is unknown based on the url field.

We have around 90k records for this table where deployment_type is unknown. The logic is simple – if URL ends in atlassian.net then most likely we're dealing with JIRA Cloud, otherwise it's JIRA Server.

Related #329530 (closed)

Migrations

bundle exec rails db:migrate
== 20210421163509 ScheduleUpdateJiraTrackerDataDeploymentTypeBasedOnUrl: migrating
-- Scheduling UpdateJiraTrackerDataDeploymentTypeBasedOnUrl jobs
-- Scheduled 1 UpdateJiraTrackerDataDeploymentTypeBasedOnUrl jobs with a maximum of 2500 records per batch and an interval of 120 seconds.

The migration is expected to take at least 120 seconds. Expect all jobs to have completed after 2021-05-20 10:14:49 UTC."
== 20210421163509 ScheduleUpdateJiraTrackerDataDeploymentTypeBasedOnUrl: migrated (0.0240s)
bundle exec rails db:migrate:down VERSION=20210421163509
== 20210421163509 ScheduleUpdateJiraTrackerDataDeploymentTypeBasedOnUrl: reverting
== 20210421163509 ScheduleUpdateJiraTrackerDataDeploymentTypeBasedOnUrl: reverted (0.0000s)

Timings

Selecting a batch

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/4132/commands/14399

Query
SELECT * FROM jira_tracker_data WHERE deployment_type = 0 AND id BETWEEN 1 AND 2500
Plan
Index Scan using jira_tracker_data_pkey on public.jira_tracker_data  (cost=0.42..1444.04 rows=1139 width=340) (actual time=49.666..700.757 rows=21 loops=1)
   Index Cond: ((jira_tracker_data.id >= 1) AND (jira_tracker_data.id <= 2500))
   Filter: (jira_tracker_data.deployment_type = 0)
   Rows Removed by Filter: 2189
   Buffers: shared hit=568 read=838 dirtied=99
   I/O Timings: read=687.760 write=0.000
Timings
Time: 707.502 ms
  - planning: 6.678 ms
  - execution: 700.824 ms
    - I/O read: 687.760 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 568 (~4.40 MiB) from the buffer pool
  - reads: 838 (~6.50 MiB) from the OS file cache, including disk I/O
  - dirtied: 99 (~792.00 KiB)
  - writes: 0

Updating a batch

Batch size of 2500: https://explain.depesz.com/s/tFlo (thanks @abrandl)

This is with a batch size of 250 – I can't get #database-lab to swallow a query with 2500 pairs that BulkUpdate generates

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/4132/commands/14414

Query
EXPLAIN WITH cte(cte_id, cte_deployment_type) AS MATERIALIZED (
VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1), (8, 1), (9, 1), (10, 1), (11, 1), (12, 1), (13, 1), (14, 1), (15, 1), (16, 1), (17, 1), (18, 1), (19, 1), (20, 1), (21, 1), (22, 1), (23, 1), (24, 1), (25, 1), (26, 1), (27, 1), (28, 1), (29, 1), (30, 1), (31, 1), (32, 1), (33, 1), (34, 1), (35, 1), (36, 1), (37, 1), (38, 1), (39, 1), (40, 1), (41, 1), (42, 1), (43, 1), (44, 1), (45, 1), (46, 1), (47, 1), (48, 1), (49, 1), (50, 1), (51, 1), (52, 1), (53, 1), (54, 1), (55, 1), (56, 1), (57, 1), (58, 1), (59, 1), (60, 1), (61, 1), (62, 1), (63, 1), (64, 1), (65, 1), (66, 1), (67, 1), (68, 1), (69, 1), (70, 1), (71, 1), (72, 1), (73, 1), (74, 1), (75, 1), (76, 1), (77, 1), (78, 1), (79, 1), (80, 1), (81, 1), (82, 1), (83, 1), (84, 1), (85, 1), (86, 1), (87, 1), (88, 1), (89, 1), (90, 1), (91, 1), (92, 1), (93, 1), (94, 1), (95, 1), (96, 1), (97, 1), (98, 1), (99, 1), (100, 1), (101, 1), (102, 1), (103, 1), (104, 1), (105, 1), (106, 1), (107, 1), (108, 1), (109, 1), (110, 1), (111, 1), (112, 1), (113, 1), (114, 1), (115, 1), (116, 1), (117, 1), (118, 1), (119, 1), (120, 1), (121, 1), (122, 1), (123, 1), (124, 1), (125, 1), (126, 1), (127, 1), (128, 1), (129, 1), (130, 1), (131, 1), (132, 1), (133, 1), (134, 1), (135, 1), (136, 1), (137, 1), (138, 1), (139, 1), (140, 1), (141, 1), (142, 1), (143, 1), (144, 1), (145, 1), (146, 1), (147, 1), (148, 1), (149, 1), (150, 1), (151, 1), (152, 1), (153, 1), (154, 1), (155, 1), (156, 1), (157, 1), (158, 1), (159, 1), (160, 1), (161, 1), (162, 1), (163, 1), (164, 1), (165, 1), (166, 1), (167, 1), (168, 1), (169, 1), (170, 1), (171, 1), (172, 1), (173, 1), (174, 1), (175, 1), (176, 1), (177, 1), (178, 1), (179, 1), (180, 1), (181, 1), (182, 1), (183, 1), (184, 1), (185, 1), (186, 1), (187, 1), (188, 1), (189, 1), (190, 1), (191, 1), (192, 1), (193, 1), (194, 1), (195, 1), (196, 1), (197, 1), (198, 1), (199, 1), (200, 1), (201, 1), (202, 1), (203, 1), (204, 1), (205, 1), (206, 1), (207, 1), (208, 1), (209, 1), (210, 1), (211, 1), (212, 1), (213, 1), (214, 1), (215, 1), (216, 1), (217, 1), (218, 1), (219, 1), (220, 1), (221, 1), (222, 1), (223, 1), (224, 1), (225, 1), (226, 1), (227, 1), (228, 1), (229, 1), (230, 1), (231, 1), (232, 1), (233, 1), (234, 1), (235, 1), (236, 1), (237, 1), (238, 1), (239, 1), (240, 1), (241, 1), (242, 1), (243, 1), (244, 1), (245, 1), (246, 1), (247, 1), (248, 1), (249, 1), (250, 1)
) UPDATE jira_tracker_data SET deployment_type = cte.cte_deployment_type FROM cte WHERE cte_id = id;
Plan
 ModifyTable on public.jira_tracker_data  (cost=3.54..829.13 rows=250 width=410) (actual time=41.433..41.435 rows=0 loops=1)
   Buffers: shared hit=2983 read=56 dirtied=224
   I/O Timings: read=27.465 write=0.000
   CTE cte
     ->  Values Scan on "*VALUES*"  (cost=0.00..3.12 rows=250 width=8) (actual time=0.002..0.250 rows=250 loops=1)
           I/O Timings: read=0.000 write=0.000
   ->  Nested Loop  (cost=0.42..826.00 rows=250 width=410) (actual time=0.067..7.001 rows=227 loops=1)
         Buffers: shared hit=981
         I/O Timings: read=0.000 write=0.000
         ->  CTE Scan on cte  (cost=0.00..5.00 rows=250 width=40) (actual time=0.010..0.485 rows=250 loops=1)
               I/O Timings: read=0.000 write=0.000
         ->  Index Scan using jira_tracker_data_pkey on public.jira_tracker_data  (cost=0.42..3.28 rows=1 width=344) (actual time=0.025..0.025 rows=1 loops=250)
               Index Cond: (jira_tracker_data.id = cte.cte_id)
               Buffers: shared hit=981
               I/O Timings: read=0.000 write=0.000
Summary
Time: 42.116 ms
  - planning: 0.594 ms
  - execution: 41.522 ms
    - I/O read: 27.465 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 2983 (~23.30 MiB) from the buffer pool
  - reads: 56 (~448.00 KiB) from the OS file cache, including disk I/O
  - dirtied: 224 (~1.80 MiB)
  - writes: 0

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team
Edited by Michał Zając

Merge request reports