Skip to content

Improve delete_software_license_policies query

Sashi Kumar Kumaresan requested to merge sk/fix-17168-inc into master

What does this MR do and why?

This MR improves the performance of delete query to include the scan_result_policy_reads of the concerned project only to reduce the number of IDs filtered. This MR also changes the query to sub-query instead of IN query.

This is a corrective action of Incident: gitlab-com/gl-infra/production#17168 (closed)

The security_orchestration_policy_configuration_id that caused the slow query is scoped to a namespace that has 54706 project. And each project has only 1 rows of scan_result_policies but since we do policy_configuration. scan_result_policy_reads.pluck(:id) it selects 54701 IDs.

More context on the data: gitlab-com/gl-infra/production#17168 (comment 1663027970)

Database

Before

SELECT
    "scan_result_policies"."id" 
FROM
    "scan_result_policies" 
WHERE
    "scan_result_policies"."security_orchestration_policy_configuration_id" = 1029627


SELECT
    "software_license_policies".* 
FROM
    "software_license_policies" 
WHERE
    "software_license_policies"."project_id" = 24583235 
    AND "software_license_policies"."scan_result_policy_id" IN (10324607, 10326279, 10324491, 10326623, 10324468, 10326230, 10324484, 10326628, 10324495, 10326710, 10324552, 10326753, 10324563, 10326804, 10324572, 10326875, 10324586, 10326412, 10324744, 10326386, 10324627, 10326702, 10324675, 10326810, 10324550, 10327045, 10324597, 10326937, 10324913, 10327059, 10324493, 10326617, 10324669, 10327301, 10324615, 10326282, 10324490, 10326621, 10324472, 10326246, 10324469, 10326235, 10324467, 10326553, 10324588, 10326624, 10324473, 10326227, 10324482, 10326228, 10324537, 10327189, 10324505, 10326631, 10324576, 10327187, 10324585, 10326312, 10324489, 10326604, 10324511, 10326644)

After

SELECT
    "software_license_policies".* 
FROM
    "software_license_policies" 
WHERE
    "software_license_policies"."project_id" = 43110842 
    AND "software_license_policies"."scan_result_policy_id" IN (
        SELECT
            "scan_result_policies"."id" 
        FROM
            "scan_result_policies" 
        WHERE
            "scan_result_policies"."security_orchestration_policy_configuration_id" = 4261 
            AND "scan_result_policies"."project_id" = 43110842

Query Plan : https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/24184/commands/77181

 Nested Loop  (cost=0.85..18.21 rows=1 width=40) (actual time=16.599..266.054 rows=456 loops=1)
   Buffers: shared hit=149 read=309 dirtied=185
   I/O Timings: read=257.544 write=0.000
   ->  Index Scan using index_scan_result_policies_on_position_in_configuration on public.scan_result_policies  (cost=0.43..3.45 rows=1 width=8) (actual time=8.047..8.049 rows=1 loops=1)
         Index Cond: ((scan_result_policies.security_orchestration_policy_configuration_id = 4261) AND (scan_result_policies.project_id = 43110842))
         Buffers: shared hit=4 read=3
         I/O Timings: read=7.941 write=0.000
   ->  Index Scan using idx_software_license_policies_unique_on_project_and_scan_policy on public.software_license_policies  (cost=0.42..14.76 rows=1 width=40) (actual time=8.544..257.659 rows=456 loops=1)
         Index Cond: ((software_license_policies.project_id = 43110842) AND (software_license_policies.scan_result_policy_id = scan_result_policies.id))
         Buffers: shared hit=145 read=306 dirtied=185
         I/O Timings: read=249.603 write=0.000

Time: 268.676 ms  
  - planning: 2.321 ms  
  - execution: 266.355 ms  
    - I/O read: 257.544 ms  
    - I/O write: 0.000 ms  
  
Shared buffers:  
  - hits: 149 (~1.20 MiB) from the buffer pool  
  - reads: 309 (~2.40 MiB) from the OS file cache, including disk I/O  
  - dirtied: 185 (~1.40 MiB)  
  - writes: 0  

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Alan (Maciej) Paruszewski

Merge request reports