Unique index violation on index_merge_request_diff_commit_users_on_name_and_email
We have a process that restores a backup to staging, and it started failing after our last upgrade to 15.3.1. Emails are redacted, although our commits are all public:
ERROR: could not create unique index "index_merge_request_diff_commit_users_on_name_and_email"
DETAIL: Key (name, email)=(Johan Kleene, joh[…]) is duplicated.
On production the same index is active:
gitlabhq_production=# \d merge_request_diff_commit_users
              Table "public.merge_request_diff_commit_users"
 Column |  Type  | Collation | Nullable |                           Default
--------+--------+-----------+----------+-------------------------------------------------------------
 id     | bigint |           | not null | nextval('merge_request_diff_commit_users_id_seq'::regclass)
 name   | text   |           |          |
 email  | text   |           |          |
Indexes:
    "merge_request_diff_commit_users_pkey" PRIMARY KEY, btree (id)
    "index_merge_request_diff_commit_users_on_name_and_email" UNIQUE, btree (name, email)
Check constraints:
    "check_147358fc42" CHECK (char_length(name) <= 512)
    "check_f5fa206cf7" CHECK (char_length(email) <= 512)
    "merge_request_diff_commit_users_name_or_email_existence" CHECK (COALESCE(name, ''::text) <> ''::text OR COALESCE(email, ''::text) <> ''::text)
However, there do seem to be duplicates:
gitlabhq_production=# SELECT name, substring(email, 1, 3), count(id) c FROM merge_request_diff_commit_users GROUP BY name, email HAVING count(id) > 1;
         name          | substring | c
-----------------------+-----------+---
 Aaron Bauman          | aar       | 2
 mohit.bansal623       | moh       | 2
 m.stenta              | mst       | 2
 BramDriesen           | bdr       | 2
 Prabhat Burnwal       | pra       | 2
 Tilen Gombac          | til       | 2
 DanielVeza            | dan       | 2
 Igor Mashevskyi       | igo       | 2
 anonymous             |           | 2
 […]
(45 rows)
The number of duplicates is increasing: two days ago, the same query returned only 26 rows.
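To see which rows are behind each duplicated pair, a variant of the duplicate check could also return the ids. This is only a sketch: it assumes nothing beyond the table and columns shown in the \d output, and the aliases email_prefix, ids and copies are mine.

```sql
-- Sketch only: list the ids behind each duplicated (name, email) pair.
-- Uses only the table and columns shown in the \d output.
SELECT name,
       substring(email, 1, 3) AS email_prefix,  -- keep addresses redacted
       array_agg(id ORDER BY id) AS ids,         -- every row sharing the pair
       count(*) AS copies
FROM merge_request_diff_commit_users
GROUP BY name, email
HAVING count(*) > 1
ORDER BY copies DESC, name;
```

Comparing the ids between two runs should show whether the newly appearing duplicates are recently inserted rows or older ones.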