Adjust issue sort order for better use of DB indexes

This is related to the comment in the MR: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/30542#note_191367165


When we're querying issues in a project, such as by created_at, we can either sort ascending or descending. It looks like when we sort descending, we use the full index. But when we sort ascending, we don't. Here's an example:

https://gitlab.com/gitlab-org/gitlab-ce/issues?sort=created_date

Using the performance bar, I can see the descending sort

2.305ms  SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 13083 AND ("issues"."state" IN ('opened')) ORDER BY "issues"."created_at" DESC, "issues"."id" DESC LIMIT 20 OFFSET 0

and the explain is

/chatops run explain SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 13083 AND ("issues"."state" IN ('opened')) ORDER BY "issues"."created_at" DESC, "issues"."id" DESC LIMIT 20 OFFSET 0

Limit  (cost=0.56..29.60 rows=20 width=767) (actual time=0.140..0.684 rows=20 loops=1)
  Buffers: shared hit=25
  ->  Index Scan Backward using index_issues_on_project_id_and_created_at_and_id_and_state on issues  (cost=0.56..23704.69 rows=16326 width=767) (actual time=0.140..0.680 rows=20 loops=1)
        Index Cond: ((project_id = 13083) AND ((state)::text = 'opened'::text))
        Buffers: shared hit=25
Planning time: 4.807 ms
Execution time: 0.716 ms

The ascending sort (https://gitlab.com/gitlab-org/gitlab-ce/issues?sort=created_asc) looks like this:

97.056ms    SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 13083 AND ("issues"."state" IN ('opened')) ORDER BY "issues"."created_at" ASC, "issues"."id" DESC LIMIT 20 OFFSET 0

and

/chatops run explain SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 13083 AND ("issues"."state" IN ('opened')) ORDER BY "issues"."created_at" ASC, "issues"."id" DESC LIMIT 20 OFFSET 0

Limit  (cost=24139.13..24139.18 rows=20 width=767) (actual time=106.307..106.317 rows=20 loops=1)
  Buffers: shared hit=17402
  ->  Sort  (cost=24139.13..24179.94 rows=16326 width=767) (actual time=106.306..106.312 rows=20 loops=1)
        Sort Key: created_at, id DESC
        Sort Method: top-N heapsort  Memory: 59kB
        Buffers: shared hit=17402
        ->  Index Scan using index_issues_on_project_id_and_created_at_and_id_and_state on issues  (cost=0.56..23704.70 rows=16326 width=767) (actual time=0.263..93.447 rows=16712 loops=1)
              Index Cond: ((project_id = 13083) AND ((state)::text = 'opened'::text))
              Buffers: shared hit=17396
Planning time: 4.913 ms
Execution time: 106.366 ms

The main difference seems to be that we changed the order direction for created_at to ASC, but we don't change the id - it stays as DESC.

I wonder if it's enough of a difference to consider changing the direction of the id order when the main attribute order is changed.


I'm not sure what the original intent was - if it was on purpose to always sort the matching records (such those with the same creation date, or a NULL field) as id DESC - meaning that the most recently created records would be first. However I think this is a little counter-intuitive to the user, who would expect them to be at the end if the sort was changed - the entire sort order of a list should change when direction is changed, not just a portion of the list.

I think it's it's a small piece of tech debt we can clean up and get a little more performance increase out of. But I don't think it'a high priority at the moment since it doesn't seem to be causing any problems with performance in general.


See the comment thread for additional details

Assignee Loading
Time tracking Loading