Skip to content

Let tie breaker order follow global sort direction

Andreas Brandl requested to merge ab/tie-breaker-order into master

What does this MR do?

We use a "tie breaker" for consistently ordering the results in the API. This is particularly relevant for cases where the order-direction is based on a non-unique column, for example name (for projects). Let's assume we want to order by name asc, then the tie breaker id desc is added to the query. This results in a query with ORDER BY name ASC, id DESC, such that the order is well defined.

Now, additionally to order_by options, we typically also allow to specify sort=asc|desc to define the direction. However, the tie breaker order is always the same: id DESC. This results in "mixed" directions like in the example above, where we order by name ASC and id DESC.

Consider we only had a "standard" index on name ASC, id ASC. We can't use it for name ASC, id DESC:

ORDER BY clause can use standard index?
name ASC, id ASC
name ASC, id DESC
name DESC, id ASC
name DESC, id DESC

In order to support these combinations, we have to have the following indexes:

Index Forward scan Backwards scan
name ASC, id ASC name ASC, id ASC name DESC, id DESC
name ASC, id DESC name ASC, id DESC name DESC, id ASC

With the tie breaker always defined as id DESC, we only need two scan types here - unfortunately each requires their own index.

With this change, we let the tie breaker follow the actual sorting direction. That is, we only have two combinations now:

  • name ASC, id ASC
  • name DESC, id DESC

Both are supported by a standard index (see above). We don't need another index on name ASC, id DESC.

Change in behavior

There is a change in behavior here: Let's use order_by=name as an example. This only applies for sort=asc: If there are records in a page with a duplicate name, those records are now going to show up in the reverse direction than before. The same applies to all other order_by options across the API.

There are no CI tests that specifically check for this behavior. I have not found any documentation that talks about it, too.

Relates to: #195881 (closed)

Next step

The next step is to validate that the targeted indexes are not needed anymore. We can do this by checking the index statistics on the database hosts:

gitlabhq_production=# select * from pg_stat_user_indexes where indexrelname ~* 'vis20.*desc';
 relid | indexrelid | schemaname | relname  |                   indexrelname                    | idx_scan | idx_tup_read | idx_tup_fetch 
-------+------------+------------+----------+---------------------------------------------------+----------+--------------+---------------
 33706 |  364202457 | public     | projects | index_projects_api_vis20_created_at_id_desc       |      260 |        35072 |         34712
 33706 |  364202526 | public     | projects | index_projects_api_vis20_last_activity_at_id_desc |     1684 |     11953789 |      11380596
 33706 |  364202598 | public     | projects | index_projects_api_vis20_updated_at_id_desc       |      214 |       132258 |        132254
 33706 |  364202675 | public     | projects | index_projects_api_vis20_name_id_desc             |      198 |       235620 |        235481
 33706 |  364202768 | public     | projects | index_projects_api_vis20_path_id_desc             |      241 |       256361 |        256297
(5 rows)

Those stats should not increase anymore after the change got deployed, assuming the indexes are only picked up for API-related queries.

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

The behavior we are changing here is not covered by existing specs. This MR adds specs for the helper but doesn't come with request spec changes.

Edited by 🤖 GitLab Bot 🤖

Merge request reports