pg_statistic is heavily bloated in gprd-ci (Postgres)
(following up my comment in a recent gprd-ci postgres-checkup report)
It turned out pg_statistic
is heavily bloated in gprd-ci
:
\dt+ pg_statistic
List of relations
Schema | Name | Type | Owner | Persistence | Access method | Size | Description
------------+--------------+-------+-------------+-------------+---------------+---------+-------------
pg_catalog | pg_statistic | table | gitlab-psql | permanent | heap | 2387 MB |
(1 row)
Since postgres-checkup provides only bloat estimates, let's verify it in dblab:
exec vacuum full pg_statistic
Session: webui-i25014
The query has been executed. Duration: 2.308 s
\dt+ pg_statistic
List of relations
Schema | Name | Type | Owner | Persistence | Access method | Size | Description
------------+--------------+-------+-------------+-------------+---------------+-------+-------------
pg_catalog | pg_statistic | table | gitlab-psql | permanent | heap | 12 MB |
(1 row)
– indeed, bloated ~200x, even more than estimated. It's 99.5%.
And as expected, VACUUM FULL
takes just a couple of seconds.
What's next:
- Should we really care? (or: does it case any issues e.g. slowing down the planning time? ideally, we need to measure / quantify this). Things to check in terms of impact from this bloat:
- planning time
- ANALYZE duration
- extra RAM consumption in general
- If yes (might be not so), how to properly mitigate it? (we cannot run pg_repack on system catalogs, so this should be
vacuum full
, perhaps executed made with lowlock_timeout
and retries, during low-activity times)
Additionally:
- check it in other clusters as well (quick check for
gprd-main
:pg_statistic
is 333 MiB there) - check other system catalogs as well, e.g.
pg_class
Edited by Nikolay Samokhvalov