Consider enabling data checksums in Postgres

(Previous discussions: https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/7244, https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/11534)

Currently, data checksums are off (to check: SHOW data_checksums;), which means that corruption at the page level can go unnoticed. Note that WAL is covered regardless: checksums on WAL records are always on.
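
For reference, here is a minimal sketch of how the current state could be checked, both online and offline. The helper name `checksum_state` is hypothetical; it only parses the "Data page checksum version" line that pg_controldata prints (0 means checksums are disabled), so it can be used on a stopped node as well.

```shell
#!/bin/sh
# Online check (requires a running server):
#   psql -XAtc 'SHOW data_checksums;'    # prints "on" or "off"
#
# Offline check: parse pg_controldata output supplied on stdin.
# checksum_state is a hypothetical helper, not part of Postgres.
checksum_state() {
  awk -F': *' '
    /^Data page checksum version/ {
      # Version 0 means checksums are disabled; non-zero means enabled.
      print ($2 == 0 ? "off" : "on")
      found = 1
    }
    END { if (!found) exit 1 }   # fail if the expected line is missing
  '
}
```

Usage on a node, assuming PGDATA points at the data directory: `pg_controldata "$PGDATA" | checksum_state`.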

Considerations for discussion and action planning:

  1. Performance: enabling checksums carries some performance penalty, but it is expected to be very low (worth double-checking).
  2. Implementation options:
    1. If we upgrade to PG13 and/or move to new instance nodes using logical replication, as currently being planned (POC: https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/16177), we will have a clear opportunity to enable checksums. However, our approach converts a physical standby node into a logical replica, so we need to use pg_checksums anyway. Fortunately, it has been included in the Postgres core package since version 12, so we can use it without third-party tooling (as was needed before 12): https://www.postgresql.org/docs/12/app-pgchecksums.html
    2. If we don't use logical replication at all, we can still use pg_checksums: first on a standby node, then performing a switchover. In this case, we need to analyze why the previous attempt (https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/11534, production#2991 (closed)) wasn't finished.
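
In both options the core step on the standby is the same. Below is a rough sketch of that step, under several assumptions: the function names `run` and `enable_checksums_on_standby` are hypothetical, plain pg_ctl is used for stop/start (on a node managed by a cluster manager, the stop/start would have to go through that tooling instead), and the script defaults to a dry run that only prints the commands. It is not a tested runbook.

```shell
#!/bin/sh
# Sketch only: enable data checksums on ONE standby node.
# DRY_RUN=1 (the default) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper; $1 is the data directory of the standby.
enable_checksums_on_standby() {
  pgdata="$1"
  # 1. pg_checksums only works on a cleanly shut down cluster.
  run pg_ctl stop -D "$pgdata" -m fast || return 1
  # 2. Rewrite all data pages with checksums; runtime grows with data size.
  run pg_checksums --enable --progress -D "$pgdata" || return 1
  # 3. Start the node again so it can catch up before any switchover.
  run pg_ctl start -D "$pgdata" || return 1
}
```

With the default dry run, `enable_checksums_on_standby /var/lib/postgresql/data` (the path is a placeholder) prints the three commands in order. Running pg_checksums with --check afterwards, again on a stopped node, would verify the result.
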
Edited by Nikolay Samokhvalov