Postgres alignment padding – logic fixed
Following @abrandl's comment https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/7302#note_101235676 I found this document describing PostgreSQL padding alignment.
While in general, it's very good and correct, I've noticed that boolean
data type is described as "not requiring alignment" and in general, reading the document, one might think that alignment for a column is determined only based on the data type of this column.
It is not so.
The alignment is based on the data type of the next column. boolean
columns do need padding, if the following column requires it and the current word is not yet "filled".
test=# create table a1 as select 1::bool, now() from generate_series(1, 100000);
SELECT 100000
test=# create table a2 as select 1::bool, 2::bool c2, now() from generate_series(1, 100000);
SELECT 100000
test=# create table a3 as select 1::bool, now(), 2::bool c2 from generate_series(1, 100000);
SELECT 100000
test=# \dt+ a*
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------+-------+---------+---------+-------------
public | a1 | table | nikolay | 4328 kB |
public | a2 | table | nikolay | 4328 kB |
public | a3 | table | nikolay | 5096 kB |
(3 rows)
Here a1
and a2
occupy the same amount of space (in a1
we have 7 zeroes for padding in each tuple, in a2
– 6), while a3
requires more (7 + 7 = 14 zeroes for padding, the first one for padding before timestamptz
column, the second is for end-of-the-tuple padding; so each tuple needs 7 more bytes).
At the same time, when we add a new boolean
column, we don't need to pad the previously inserted data, because boolean
takes only 1 byte. (Yes, not 1 bit, 1 byte, sad news for many, but this is another story.)
This MR fixes it and also a few other inaccuracies.