Skip to content

Improve performance of user contribution graph query

Heinrich Lee Yu requested to merge improve-calendar-json-query into master

What does this MR do and why?

Some contribution graphs are very slow. Specially for users that have many years of contributions. This is mainly due to the inefficient filtering on created_at.

Sample URLs:

This MR improves this by doing 3 things:

  1. Simplifies the query so that we GROUP BY date outside the UNION and return the totals instead of aggregating in Ruby. This also removes extra columns we're selecting that are never used. This results in the query returning at most 366 rows instead of 1 row per target type, action, and project.

  2. Avoids the use of the projects table to get the project_id since we can use project_features.project_id. This change moves the scopes to the ProjectFeature model so we can use it without joining with Project.

  3. Improves the index by adding action, target_type, and created_at.

    Before, it was doing something like this:

     ->  Index Scan using index_events_on_author_id_and_project_id on public.events  (cost=0.58..3.60 rows=1 width=12) (actual time=0.376..1.098 rows=51 loops=120)
         Index Cond: ((events.author_id = 64248) AND (events.project_id = projects.id))
         Filter: ((events.created_at >= '2020-11-22 00:00:00+00'::timestamp with time zone) AND (events.created_at <= '2021-11-22 23:59:59.999999+00'::timestamp with time zone) AND (events.action = 5))
         Rows Removed by Filter: 599
         Buffers: shared hit=73456
         I/O Timings: read=0.000 write=0.000

    With this new index:

    ->  Index Only Scan using index_events_on_author_id_and_project_id_action_target_type_cre on public.events  (cost=0.70..2.27 rows=1 width=12) (actual time=0.010..0.045 rows=51 loops=120)
        Index Cond: ((events.author_id = 64248) AND (events.project_id = project_features.project_id) AND (events.action = 5) AND (events.target_type IS NULL) AND (events.created_at >= '2020-11-22 00:00:00+00'::timestamp with time zone) AND (events.created_at <= '2021-11-22 23:59:59.999999+00'::timestamp with time zone))
        Heap Fetches: 61
        Buffers: shared hit=2190
        I/O Timings: read=0.000 write=0.000

Migration output

== 20211122033501 ImproveIndexOnEventsForCalendar: migrating ==================
== 20211122033501 ImproveIndexOnEventsForCalendar: migrated (0.0026s) =========

== 20211122033501 ImproveIndexOnEventsForCalendar: reverting ==================
== 20211122033501 ImproveIndexOnEventsForCalendar: reverted (0.0023s) =========

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Heinrich Lee Yu

Merge request reports